Modern Performance - SQL Server Joe Chang www.qdpma.com Jchang6 @ yahoo

Page 1: Modern Performance - SQL Server

Modern Performance - SQL Server
Joe Chang

www.qdpma.com Jchang6 @ yahoo

Page 2: Modern Performance - SQL Server

About Joe
• SQL Server consultant since 1999
• Query Optimizer execution plan cost formulas (2002)
• True cost structure of SQL plan operations (2003?)
• Database with distribution statistics only, no data (2004)
• Decoding statblob/stats_stream
  – writing your own statistics
• Disk IO cost structure
• Tools for system monitoring, execution plan analysis

See ExecStats: http://www.qdpma.com/ExecStats/SQLExecStats.html
Download: http://www.qdpma.com/ExecStatsZip.html
Blog: http://sqlblog.com/blogs/joe_chang/default.aspx

Page 3: Modern Performance - SQL Server

Overview

• Why is performance still important today?
  – Brute force?
    • Yes, but …
• Special Topics
• Automating data collections
• SQL Server Engine
  – What developers/DBA need to know?

Page 4: Modern Performance - SQL Server

CPU & Memory 2001 versus 2012

2001 – 4 sockets, 4 cores
Pentium III Xeon, 900MHz
4-8GB memory?

Xeon MP 2002-4

2012 – 4 sockets, 8 cores each
4 x 8 = 32 cores total
Westmere-EX: 1TB (64 x 16GB)
Sandy Bridge E5: 768GB (48 x 16GB)
15 cores in Xeon E7 v2: 3TB (96 x 32GB)

[Diagram: 2001 system with four processors (P, L2) on a shared FSB to the MCH; 2012 system with four sockets, each with cores C0-C7, LLC, memory interfaces (MI), PCI-E and QPI links, and DMI 2 on one socket]

Each core today is more than 10x over Pentium III

Memory price   2013   2014
16GB           $191   $180
32GB           $794   $650

Page 5: Modern Performance - SQL Server

CPU & Memory 2001 versus 2014

[Diagram: 2014 system with four sockets, each with core blocks (C0-C9 shown), LLC, memory interfaces (MI), PCI-E and QPI links, and DMI 2 on one socket]

Xeon E7 v2 (Ivy Bridge)
4 x 15 = 60 cores
3TB (96 x 32GB), 24 DIMMs per socket (12 shown)

2001 – 4 sockets, 4 cores
Pentium III Xeon, 900MHz
4-8GB memory?

Xeon MP 2002-4

[Diagram: 2001 system with four processors (P, L2) on a shared FSB to the MCH]

Each core today is more than 10x over Pentium III (700MHz?)

Memory price   2013   2014
16GB           $191   $180
32GB           $794   $650

Page 6: Modern Performance - SQL Server

Intel E5 & E7 v2 (Ivy-Bridge)

[Diagram: E3 v3 block diagram with PCH, DMI, x4 x4 x4 x4 links, memory controller (MC), and graphics (GFX)]

Page 7: Modern Performance - SQL Server

Storage 2001 versus 2012/13

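[Diagram: 2013 storage layout with two sockets linked by QPI, 192 GB memory each; five PCIe x8 slots and one PCIe x4 slot carrying four RAID controllers, an IB adapter and a 10GbE adapter; HDD and SSD arrays behind the RAID controllers]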

2001
100 x 10K HDD, 125 IOPS each = 12.5K IOPS
IO bandwidth limited: 1.3GB/s (1/3 memory bandwidth)

2013
64 SSDs, more than 10K IOPS each, 1M IOPS possible
IO bandwidth 10GB/s easy

SAN vendors – questionable BW

[Diagram: 2001 storage layout with an MCH, four PCI buses, four RAID controllers, and HDD arrays behind each controller]

http://www.qdpma.com/Storage/Storage2013.html
http://www.qdpma.com/ppt/Storage_2013.pptx

Page 8: Modern Performance - SQL Server

SAN

[Diagram: two nodes (Node 1, Node 2; 768 GB memory each) connected through two switches over 8 Gbps FC or 10 Gbps FCoE to a SAN with two storage processors (SP A, SP B; 24 GB each); x4 SAS (2GB/s) links to the disk enclosures; main volume with Data 1 to Data 16 and SSD 1 to SSD 4; log volume with Log 1 to Log 4; SSD, 10K and 7.2K drives plus hot spares]

http://sqlblog.com/blogs/joe_chang/archive/2013/05/10/enterprise-storage-systems-emc-vmax.aspx
http://sqlblog.com/blogs/joe_chang/archive/2013/02/25/emc-vnx2-and-vnx-future.aspx

Page 9: Modern Performance - SQL Server

Performance Past, Present, Future

• When will servers be so powerful that …
  – Been saying this for a long time
• Today – 10 to 100X overkill
  – 32 cores in 2012, 60 cores in 2014
  – Enough memory that IO is only sporadic
  – Unlimited IOPS with SSD
• What can go wrong?
  Today's topic

Page 10: Modern Performance - SQL Server

Factors to Consider

SQL Tables Indexes

Query Optimizer

Statistics

Compile Parameters

Storage Engine

Hardware

DOP, memory

Page 11: Modern Performance - SQL Server

Special Topics

• Data type mismatch
• Multiple Optional Search Arguments (SARG)
  – Function on SARG
• Parameter Sniffing versus Variables
• Statistics related (big topic)
• First OR, then AND/OR combinations
• Complex Query with sub-expressions
• Parallel Execution

Not in order of priority

http://blogs.msdn.com/b/sqlcat/archive/2013/09/09/when-to-break-down-complex-queries.aspx

Page 12: Modern Performance - SQL Server

1a. Data type mismatch

DECLARE @name nvarchar(25) = N'Customer#000002760'
SELECT * FROM CUSTOMER WHERE C_NAME = @name

SELECT * FROM CUSTOMER WHERE C_NAME = CONVERT(varchar(25), @name)

.NET auto-parameter discovery?

First query: unable to use index seek
Table column is varchar
Parameter/variable is nvarchar
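A minimal sketch of the mismatch and the simplest fix, which is to declare the variable with the column's own type. The temp table #Customer and its index are hypothetical stand-ins for CUSTOMER. Whether the mismatched form degrades to a scan or to a less efficient range seek depends on the column collation, so compare the two actual plans on your own system.

```sql
-- Hypothetical stand-in for the CUSTOMER table from the slides.
CREATE TABLE #Customer (
    C_CUSTKEY int NOT NULL PRIMARY KEY,
    C_NAME    varchar(25) NOT NULL
);
CREATE INDEX IX_Customer_Name ON #Customer (C_NAME);
INSERT INTO #Customer (C_CUSTKEY, C_NAME) VALUES (2760, 'Customer#000002760');

-- Mismatch: nvarchar has higher data type precedence than varchar,
-- so the column side is implicitly converted (CONVERT_IMPLICIT in the plan).
DECLARE @name_n nvarchar(25) = N'Customer#000002760';
SELECT C_CUSTKEY FROM #Customer WHERE C_NAME = @name_n;

-- Match: the variable has the column's type, so the predicate can seek the index directly.
DECLARE @name_v varchar(25) = 'Customer#000002760';
SELECT C_CUSTKEY FROM #Customer WHERE C_NAME = @name_v;

DROP TABLE #Customer;
```

Both queries return the same row; only the plan differs. In client code the equivalent fix is to set the parameter type explicitly instead of letting the driver infer nvarchar from a string value.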

Page 13: Modern Performance - SQL Server

1b. Type Mismatch – Row Estimate

SELECT * FROM CUSTOMER WHERE C_NAME LIKE 'Customer#00000276%'
SELECT * FROM CUSTOMER WHERE C_NAME LIKE N'Customer#00000276%'

Row estimate error could have severe consequences in a complex query

Page 14: Modern Performance - SQL Server

SELECT TOP + Row Estimate Error

SELECT TOP 1000 [Document].[ArtifactID]
FROM [Document] (NOLOCK)
WHERE [Document].[AccessControlListID_D] IN (1,1000064,1000269)
AND EXISTS (
  SELECT [DocumentBatch].[BatchArtifactID]
  FROM [DocumentBatch] (NOLOCK)
  INNER JOIN [Batch] (NOLOCK)
  ON [Batch].ArtifactID = [DocumentBatch].[BatchArtifactID]
  WHERE [DocumentBatch].[DocumentArtifactID] = [Document].[ArtifactID]
  AND [Batch].[Name] LIKE N'%Value%'
)
ORDER BY [Document].[ArtifactID]

Data type mismatch – results in a high row estimate
TOP clause – plan assumes the first 1000 rows are easy to find

In fact, few rows match the SARG
Wrong plan for evaluating a large number of rows

http://www.qdpma.com/CBO/Relativity.html

Page 15: Modern Performance - SQL Server

2. Multiple Optional SARG

DECLARE @Orderkey int, @Partkey int = 1

SELECT * FROM LINEITEM
WHERE (@Orderkey IS NULL OR L_ORDERKEY = @Orderkey)
  AND (@Partkey IS NULL OR L_PARTKEY = @Partkey)
  AND (@Partkey IS NOT NULL OR @Orderkey IS NOT NULL)

Page 16: Modern Performance - SQL Server

IF block

DECLARE @Orderkey int, @Partkey int = 1

IF (@Orderkey IS NOT NULL)
  SELECT * FROM LINEITEM
  WHERE (L_ORDERKEY = @Orderkey)
    AND (@Partkey IS NULL OR L_PARTKEY = @Partkey)
ELSE IF (@Partkey IS NOT NULL)
  SELECT * FROM LINEITEM
  WHERE (L_PARTKEY = @Partkey)

Need to consider the impact of Parameter Sniffing
Consider the OPTIMIZE FOR hint

The declared variables are actually the stored procedure parameters
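The IF block with the OPTIMIZE FOR hint might look like the sketch below once it is wrapped in a procedure. The procedure name dbo.p_LineitemSearch and the "typical" value 1 are assumptions for illustration only. OPTIMIZE FOR ... UNKNOWN requires SQL Server 2008 or later.

```sql
-- Hypothetical procedure; LINEITEM and its columns come from the slides.
CREATE PROCEDURE dbo.p_LineitemSearch
    @Orderkey int = NULL,
    @Partkey  int = NULL
AS
BEGIN
    IF (@Orderkey IS NOT NULL)
        SELECT * FROM LINEITEM
        WHERE L_ORDERKEY = @Orderkey
          AND (@Partkey IS NULL OR L_PARTKEY = @Partkey)
        -- Compile for average density rather than the first caller's value.
        OPTION (OPTIMIZE FOR (@Orderkey UNKNOWN));
    ELSE IF (@Partkey IS NOT NULL)
        SELECT * FROM LINEITEM
        WHERE L_PARTKEY = @Partkey
        -- Compile for a value assumed to be representative (1 is a placeholder).
        OPTION (OPTIMIZE FOR (@Partkey = 1));
END;
```

Each branch gets its own statement-level plan, so the hint on one branch does not affect the other.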

Page 17: Modern Performance - SQL Server

Dynamically Built Parameterized SQL

DECLARE @Orderkey int, @Partkey int = 1,
  @SQL nvarchar(500), @Param nvarchar(100)
SELECT @SQL = N'/* Comment */
SELECT * FROM LINEITEM WHERE 1=1',
  @Param = N'@Orderkey int, @Partkey int'

IF (@Orderkey IS NOT NULL)
  SELECT @SQL = @SQL + N' AND L_ORDERKEY = @Orderkey'
IF (@Partkey IS NOT NULL)
  SELECT @SQL = @SQL + N' AND L_PARTKEY = @Partkey'
PRINT @SQL
exec sp_executesql @SQL, @Param, @Orderkey, @Partkey

IF block is easier for few options
Dynamically built parameterized SQL is better for many options
Consider a /* comment */ to help identify the source of the SQL

Page 18: Modern Performance - SQL Server

2b. Function on column SARG

SELECT COUNT(*), SUM(L_EXTENDEDPRICE) FROM LINEITEM
WHERE YEAR(L_SHIPDATE) = 1995 AND MONTH(L_SHIPDATE) = 1

SELECT COUNT(*), SUM(L_EXTENDEDPRICE) FROM LINEITEM
WHERE L_SHIPDATE BETWEEN '1995-01-01' AND '1995-01-31'

DECLARE @Startdate date, @Days int = 1
SELECT COUNT(*), SUM(L_EXTENDEDPRICE) FROM LINEITEM
WHERE L_SHIPDATE BETWEEN @Startdate AND DATEADD(dd,@Days,@Startdate)
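As a sketch of the same rewrite with a parameterized month, a half-open range keeps the column free of functions and stays correct even if L_SHIPDATE were a datetime with a time portion (the slides assume a date type). The start value below is only an example.

```sql
DECLARE @Startdate date = '1995-01-01';  -- example value

SELECT COUNT(*), SUM(L_EXTENDEDPRICE)
FROM LINEITEM
WHERE L_SHIPDATE >= @Startdate                    -- inclusive lower bound
  AND L_SHIPDATE <  DATEADD(month, 1, @Startdate); -- exclusive upper bound: first day of next month
```

The function is applied to the variable, not the column, so an index on L_SHIPDATE remains usable for a seek.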

Page 19: Modern Performance - SQL Server

Estimated versus Actual Plan - rows

Estimated Plan – 1 row???

Actual Plan – actual rows 77,356

Page 20: Modern Performance - SQL Server

3 Parameter Sniffing

-- first call, procedure compiles with these parameters
exec p_Report @startdate = '2011-01-01', @enddate = '2011-12-31'

-- subsequent calls, procedure executes with original plan
exec p_Report @startdate = '2012-01-01', @enddate = '2012-01-07'

Need different execution plans for narrow and wide range
Options:
1) WITH RECOMPILE
2) Main procedure calls 1 of 2 identical sub-procedures
   One sub-procedure is only called for narrow range
   Other called for wide range

Skewed data distributions also important
Example: Large & small customers

Assuming date data type
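Option 2 can be sketched as a thin dispatcher. The names p_Report_Dispatch, p_Report_Narrow and p_Report_Wide are hypothetical, and the 31-day cutoff is an assumed threshold that would need to be chosen from the actual data. The two sub-procedures would contain the same query as p_Report, so each caches a plan compiled only for its own kind of range.

```sql
-- Option 1 for comparison: recompile on every call, at the cost of compile time.
--   EXEC p_Report @startdate = '2012-01-01', @enddate = '2012-01-07' WITH RECOMPILE;

-- Option 2: hypothetical dispatcher; sub-procedure names and cutoff are assumptions.
CREATE PROCEDURE dbo.p_Report_Dispatch
    @startdate date,
    @enddate   date
AS
BEGIN
    IF (DATEDIFF(day, @startdate, @enddate) <= 31)
        -- Narrow range: plan compiled from narrow-range parameter values.
        EXEC dbo.p_Report_Narrow @startdate = @startdate, @enddate = @enddate;
    ELSE
        -- Wide range: plan compiled from wide-range parameter values.
        EXEC dbo.p_Report_Wide @startdate = @startdate, @enddate = @enddate;
END;
```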

Page 21: Modern Performance - SQL Server

4 Statistics

• Auto-recompute points
• Sampling strategy
  – How much to sample - theory?
  – Random pages versus random rows
  – Histogram Equal and Range Rows
  – Out of bounds, value does not exist
  – etc.

Statistics Used by the Query Optimizer in SQL Server 2008
Writer: Eric N. Hanson and Yavor Angelov
Contributor: Lubor Kollar
http://msdn.microsoft.com/en-us/library/dd535534.aspx

Page 22: Modern Performance - SQL Server

Statistics Structure

• Stored (mostly) in binary field

Scalar values
Density Vector – limit 30, half in NC, half Cluster key
Histogram – up to 200 steps

Consider not blindly using IDENTITY on critical tables
Example: Large customers get low ID values
Small customers get high ID values

http://sqlblog.com/blogs/joe_chang/archive/2012/05/05/decoding-stats-stream.aspx

Page 23: Modern Performance - SQL Server

Statistics Auto/Re-Compute

• Automatically generated on query compile
• Recompute at 6 rows, 500, every 20%?

Has this changed?
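One way to see how close a table is to its recompute point is to look at the modification counter per statistics object. This is a sketch that assumes the table is dbo.LINEITEM and that sys.dm_db_stats_properties is available (SQL Server 2008 R2 SP2, 2012 SP1, or later).

```sql
SELECT s.name                  AS stats_name,
       sp.last_updated,
       sp.rows,                 -- rows in the table when statistics were last updated
       sp.rows_sampled,         -- rows actually read for the sample
       sp.modification_counter  -- changes to the leading column since the last update
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID(N'dbo.LINEITEM');  -- assumed schema and table name
```

Comparing modification_counter against rows shows which statistics are approaching the 20% figure quoted above.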

Page 24: Modern Performance - SQL Server

Statistics Sampling

• Sampling theory
  – True random sample
  – Sample error: square root of N
    • Relative error: 1/√N
• SQL Server sampling
  – Random pages
    • But always first and last page???
  – All rows in selected pages
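When page sampling gives poor estimates, the sample size can be set explicitly. A minimal sketch, assuming the table is dbo.LINEITEM and a hypothetical index named IX_LINEITEM_SHIPDATE; the 10 percent figure is an arbitrary example, not a recommendation.

```sql
-- Sampled update: reads a subset of pages and all rows on those pages.
UPDATE STATISTICS dbo.LINEITEM WITH SAMPLE 10 PERCENT;

-- Full scan: no sampling error, but reads the whole table.
UPDATE STATISTICS dbo.LINEITEM WITH FULLSCAN;

-- Inspect the result: the header reports Rows versus Rows Sampled.
-- IX_LINEITEM_SHIPDATE is a hypothetical index name.
DBCC SHOW_STATISTICS ('dbo.LINEITEM', IX_LINEITEM_SHIPDATE) WITH STAT_HEADER;
```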

Page 25: Modern Performance - SQL Server

Row Estimate Problems

• Skewed data distribution
• Out of bounds
• Value does not exist

Page 26: Modern Performance - SQL Server

Loop Join - Table Scan on Inner Source

Estimated output from the first 2 tables (at right) is zero or 1 rows. The most efficient join to the third table (without an index on the join column) is then a loop join with a scan. If the actual row count is 2 or more, a full scan is performed for each row from the outer source.

Default statistics rules may lead to serious ETL issues
Consider a custom strategy

Page 27: Modern Performance - SQL Server
Page 28: Modern Performance - SQL Server

Compile Parameter Not Exists

Main procedure has cursor around view_Servers
First server in view_Servers is 'CAESIUM'
Cursor executes sub-procedure for each Server
sql:

SELECT MAX(ID) FROM TReplWS WHERE Hostname = @ServerName

But CAESIUM does not exist in TReplWS!

Page 29: Modern Performance - SQL Server

Good and Bad Plan?

Page 30: Modern Performance - SQL Server

SqlPlan Compile Parameters

Page 31: Modern Performance - SQL Server

SqlPlan Compile Parameters

<?xml version="1.0" encoding="utf-8"?>
<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.50.2500.0">
  <BatchSequence>
    <Batch>
      <Statements>
        <StmtSimple StatementText="@ServerName varchar(50) SELECT @maxid = ISNULL(MAX(id),0) FROM TReplWS WHERE Hostname = @ServerName"
            StatementId="1" StatementCompId="43" StatementType="SELECT"
            StatementSubTreeCost="0.0032843" StatementEstRows="1" StatementOptmLevel="FULL"
            QueryHash="0x671D2B3E17E538F1" QueryPlanHash="0xEB64FB22C47E1CF2"
            StatementOptmEarlyAbortReason="GoodEnoughPlanFound">
          <StatementSetOptions QUOTED_IDENTIFIER="true" ARITHABORT="false"
              CONCAT_NULL_YIELDS_NULL="true" ANSI_NULLS="true" ANSI_PADDING="true"
              ANSI_WARNINGS="true" NUMERIC_ROUNDABORT="false" />
          <QueryPlan CachedPlanSize="16" CompileTime="1" CompileCPU="1" CompileMemory="168">
            <RelOp NodeId="0" PhysicalOp="Compute Scalar" LogicalOp="Compute Scalar"
                EstimateRows="1" EstimateIO="0" EstimateCPU="1e-007" AvgRowSize="15"
                EstimatedTotalSubtreeCost="0.0032843" Parallel="0"
                EstimateRebinds="0" EstimateRewinds="0">
            </RelOp>
            <ParameterList>
              <ColumnReference Column="@ServerName" ParameterCompiledValue="'CAESIUM'" />
            </ParameterList>
          </QueryPlan>
        </StmtSimple>
      </Statements>
    </Batch>
  </BatchSequence>
</ShowPlanXML>

Compile parameter values at bottom of sqlplan file
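The same ParameterCompiledValue attribute can be read straight from the plan cache instead of opening each sqlplan file. This is a sketch: the filter on the table name TReplWS is just the example from the slides, and shredding every cached plan can be expensive on a busy server.

```sql
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT cp.usecounts,
       p.c.value('@Column', 'nvarchar(128)')                  AS parameter_name,
       p.c.value('@ParameterCompiledValue', 'nvarchar(4000)') AS compiled_value,
       st.text                                                AS sql_text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle)   AS st
-- One output row per parameter recorded in the plan's ParameterList.
CROSS APPLY qp.query_plan.nodes('//ParameterList/ColumnReference') AS p(c)
WHERE st.text LIKE N'%TReplWS%';  -- example filter; adjust to the procedure of interest
```

A compiled value that does not exist in the table, as with 'CAESIUM' here, is a hint that the cached plan was built from an unrepresentative first call.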

Page 32: Modern Performance - SQL Server

More Plan Details

Query joining 6 tables
Each table has too many indexes
Row estimate is high – plan cost is high
Query optimizer tries really, really hard to find a better plan
Actual row count is moderate, any plan works

Page 33: Modern Performance - SQL Server

5a Single Table OR

-- Single table
SELECT * FROM LINEITEM
WHERE L_ORDERKEY = 1 OR L_PARTKEY = 184826

Page 34: Modern Performance - SQL Server

5a Join 2 Tables, OR in SARG

SELECT O_ORDERDATE, O_ORDERKEY, L_SHIPDATE, L_QUANTITY
FROM LINEITEM INNER JOIN ORDERS ON O_ORDERKEY = L_ORDERKEY
WHERE L_PARTKEY = 184826 OR O_CUSTKEY = 137099

Page 35: Modern Performance - SQL Server

5a UNION (ALL) instead of OR

SELECT O_ORDERDATE, O_ORDERKEY, L_SHIPDATE, L_QUANTITY, O_CUSTKEY, L_PARTKEY
FROM LINEITEM INNER JOIN ORDERS ON O_ORDERKEY = L_ORDERKEY
WHERE L_PARTKEY = 184826
UNION ALL -- or UNION, see caution below
SELECT O_ORDERDATE, O_ORDERKEY, L_SHIPDATE, L_QUANTITY, O_CUSTKEY, L_PARTKEY
FROM LINEITEM INNER JOIN ORDERS ON O_ORDERKEY = L_ORDERKEY
WHERE O_CUSTKEY = 137099
-- AND (L_PARTKEY <> 184826 OR L_PARTKEY IS NULL) -- Hugo Kornelis trick: with UNION ALL, excludes rows already returned by the first branch

Caution: select list should have keys to ensure correct rows
UNION removes duplicates (with Sort operation)
UNION ALL does not
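A small self-contained check of the rewrite, using hypothetical temp tables with three line items, one of which satisfies both conditions. With the exclusion predicate enabled, the UNION ALL form returns the same three rows as the OR form and needs no duplicate-removing sort.

```sql
-- Hypothetical miniature versions of ORDERS and LINEITEM.
CREATE TABLE #Orders (O_ORDERKEY int NOT NULL PRIMARY KEY, O_CUSTKEY int NOT NULL);
CREATE TABLE #Lineitem (
    L_ORDERKEY   int NOT NULL,
    L_LINENUMBER int NOT NULL,
    L_PARTKEY    int NULL,
    PRIMARY KEY (L_ORDERKEY, L_LINENUMBER)
);
INSERT INTO #Orders VALUES (1, 137099), (2, 5);
INSERT INTO #Lineitem VALUES (1, 1, 184826), (1, 2, 7), (2, 1, 184826);

-- OR form: rows (1,1), (1,2), (2,1).
SELECT L_ORDERKEY, L_LINENUMBER
FROM #Lineitem INNER JOIN #Orders ON O_ORDERKEY = L_ORDERKEY
WHERE L_PARTKEY = 184826 OR O_CUSTKEY = 137099;

-- UNION ALL form: the second branch skips rows the first branch already returned,
-- so row (1,1), which matches both conditions, appears once.
SELECT L_ORDERKEY, L_LINENUMBER
FROM #Lineitem INNER JOIN #Orders ON O_ORDERKEY = L_ORDERKEY
WHERE L_PARTKEY = 184826
UNION ALL
SELECT L_ORDERKEY, L_LINENUMBER
FROM #Lineitem INNER JOIN #Orders ON O_ORDERKEY = L_ORDERKEY
WHERE O_CUSTKEY = 137099
  AND (L_PARTKEY <> 184826 OR L_PARTKEY IS NULL);

DROP TABLE #Lineitem;
DROP TABLE #Orders;
```

The IS NULL branch of the predicate matters because `L_PARTKEY <> 184826` alone evaluates to unknown for NULL part keys and would silently drop those rows.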

Page 36: Modern Performance - SQL Server

5b AND/OR Combinations

• Hash Join is a good method to process many rows
  – Requirement is an equality join condition
• In complex SQL with AND/OR or IN / NOT IN combinations
  – Query optimizer may not be able to determine that an equality join condition exists
  – Execution plan will use loop join,
  – and an attempt to force a hash join will be rejected
• Re-write using UNION in place of OR
• And LEFT JOIN in place of NOT IN

SELECT xx FROM A WHERE col1 IN (expr1) AND col2 NOT IN (expr2)
SELECT xx FROM A WHERE (expr1) AND (expr2 OR expr3)

More on AND/OR combinations: http://www.qdpma.com/CBO/Relativity3.html
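A sketch of the NOT IN to LEFT JOIN rewrite on hypothetical temp tables #A and #B. The two forms are equivalent here only because col2 is declared NOT NULL on both sides; with NULLs in the subquery, NOT IN returns no rows at all, so check nullability before applying the rewrite.

```sql
-- Hypothetical tables; col2 is NOT NULL on both sides.
CREATE TABLE #A (xx varchar(10) NOT NULL, col2 int NOT NULL);
CREATE TABLE #B (col2 int NOT NULL PRIMARY KEY);
INSERT INTO #A VALUES ('a', 1), ('b', 2), ('c', 3);
INSERT INTO #B VALUES (2), (4);

-- NOT IN form: returns 'a' and 'c'.
SELECT a.xx
FROM #A AS a
WHERE a.col2 NOT IN (SELECT b.col2 FROM #B AS b);

-- LEFT JOIN form: explicit equality join, keep only rows with no match.
SELECT a.xx
FROM #A AS a
LEFT JOIN #B AS b ON b.col2 = a.col2
WHERE b.col2 IS NULL;

DROP TABLE #A;
DROP TABLE #B;
```

The LEFT JOIN form exposes the equality condition directly, which gives the optimizer the option of a hash join.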

Page 37: Modern Performance - SQL Server

Complex Query with Sub-expression

• Query complexity – really high compile cost
• Repeating sub-expressions (including CTE)
  – Must be evaluated multiple times
• Main Problem - Row estimate error propagation
• Solution/Strategy – Get a good execution plan
  – Temp table when estimate is high, actual is low

More on AND/OR combinations: http://www.qdpma.com/CBO/Relativity4.html
http://blogs.msdn.com/b/sqlcat/archive/2013/09/09/when-to-break-down-complex-queries.aspx

When the estimate is low and the actual row count is high, need to balance temp table insert overhead versus plan benefit. Would a join hint work?

Page 38: Modern Performance - SQL Server

Temp Table and Table Variable

• Forget what other people have said
  – Most is cr@p
• Temp table – subject to statistics auto/re-compile
• Table variable – no statistics, assumes 1 row
• Question: In each specific case, do the statistics and recompile help or not?
  – Yes: temp table
  – No: table variable
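The two choices side by side, as a sketch against the ORDERS and LINEITEM tables from the earlier slides; the key column name and the customer value are illustrative. The queries are identical, so the only difference is whether the optimizer sees statistics on the intermediate rows.

```sql
-- Temp table: gets statistics, and a changed row count can trigger a recompile,
-- so the join below is costed from the actual number of keys.
CREATE TABLE #Keys (OrderKey int NOT NULL PRIMARY KEY);
INSERT INTO #Keys (OrderKey)
SELECT O_ORDERKEY FROM ORDERS WHERE O_CUSTKEY = 137099;

SELECT l.L_ORDERKEY, l.L_SHIPDATE, l.L_QUANTITY
FROM LINEITEM AS l
INNER JOIN #Keys AS k ON k.OrderKey = l.L_ORDERKEY;

DROP TABLE #Keys;

-- Table variable: no statistics, so the join is costed as if it held 1 row.
-- Fine when the row count really is small and a stable plan is preferred.
DECLARE @Keys TABLE (OrderKey int NOT NULL PRIMARY KEY);
INSERT INTO @Keys (OrderKey)
SELECT O_ORDERKEY FROM ORDERS WHERE O_CUSTKEY = 137099;

SELECT l.L_ORDERKEY, l.L_SHIPDATE, l.L_QUANTITY
FROM LINEITEM AS l
INNER JOIN @Keys AS k ON k.OrderKey = l.L_ORDERKEY;
```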

Page 39: Modern Performance - SQL Server

Parallelism

• Designed for 1998 era
  – Cost Threshold for Parallelism: default 5
  – Max Degree of Parallelism – instance level
  – OPTION (MAXDOP n) – query level
• Today – complex system – 32 cores
  – Plan cost 5 query might run in 10ms?
  – Some queries at DOP 4
  – Others at DOP 16?

More on Parallelism: http://www.qdpma.com/CBO/ParallelismComments.html
http://www.qdpma.com/CBO/ParallelismOnset.html

Really need to rethink parallelism / NUMA strategies

Number of concurrently running queries x DOP less than number of logical/physical processors?
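The three controls named above, as a sketch. The values 50, 8 and 4 are placeholders to show the syntax, not recommendations; suitable settings depend on the workload and core count.

```sql
-- Instance level: both settings are advanced options.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'cost threshold for parallelism', 50;  -- placeholder value (default is 5)
EXEC sp_configure 'max degree of parallelism', 8;        -- placeholder value
RECONFIGURE;

-- Query level: overrides the instance-level max degree of parallelism for this statement.
SELECT COUNT(*), SUM(L_EXTENDEDPRICE)
FROM LINEITEM
OPTION (MAXDOP 4);  -- placeholder value
```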

Page 40: Modern Performance - SQL Server

Full-Text Search

Loop Join with FT as inner source
Full Text search potentially executed many times

Page 41: Modern Performance - SQL Server

varchar(max) stored in lob pages

• Disk IO to lob pages is synchronous?
  – Must access row to get 16 byte link?
  – Feature request: index pointer to lob

SQL PASS 2013
Understanding Data Files at the Byte Level
Mark Rasmussen

Page 42: Modern Performance - SQL Server

Summary

• Hardware today is really powerful
  – Storage may not be – SAN vendor disconnect
• Standard performance practice
  – Top resource consumers, index usage
• But also look for serious blunders

http://www.qdpma.com/CBO/SQLServerCostBasedOptimizer.html
http://www.qdpma.com/CBO/Relativity.html
http://blogs.msdn.com/b/sqlcat/archive/2013/09/09/when-to-break-down-complex-queries.aspx

Page 43: Modern Performance - SQL Server

Special Topics

• Data type mismatch
• Multiple Optional Search Arguments (SARG)
  – Function on SARG
• Parameter Sniffing versus Variables
• Statistics related (big topic)
• AND/OR
• Complex Query with sub-expressions
• Parallel Execution

Page 44: Modern Performance - SQL Server

SQL Server Edition Strategies

• Enterprise Edition – per core licensing costs
  – Old system strategy
    • 4 (or 2)-socket server, top processor, max memory
  – Today: How many cores are necessary
    • 2 socket system, max memory (16GB DIMMs)
• Is standard edition adequate
  – Low cost, but many important features disabled
• BI edition – 16 cores
  – Limited to 64GB for SQL Server process

Page 45: Modern Performance - SQL Server

New Features in SQL Server

• 2005
  – Index included columns
  – Partitioning
  – CLR
• 2008
  – Filtered index
  – Compression
• 2012
  – Column store (non-clustered)
• 2014
  – Column store clustered
  – Hekaton