© Copyright 2010 EMC Corporation. All rights reserved.
Best Practices for Microsoft SQL Server
Strategic Solutions Engineering (SSE), EMC Solutions Group (ESG)
Eyal Sharon, Michael Morris
Proven Solutions approach
Requirements
1. Capture & Define
2. Test and Validate
3. Document
4. Publish
Locations: Singapore; Shanghai, China; Cork, Ireland; Durham, NC; Santa Clara, CA; Vienna, Austria
Solutions Overview for Microsoft Applications
Tiered/Unified Storage
• Symmetrix VMAX
• Symmetrix VMAXe
• VNX
• VNXe
• Atmos
Replication, Backup and Recovery
• Replication Manager
• NetWorker
• Avamar
• Data Domain
Business Continuity
• VPLEX
• SRDF
• RecoverPoint
• vCenter SRM
• Cluster Enabler
Security
• RSA Data Loss Prevention Suite
• RSA SecureView
• RSA enVision
• RSA SecurID
• RSA Adaptive Authentication
Virtualization and Private Cloud
• VMware vSphere
• Microsoft Hyper-V
• VCE Vblock
• VPLEX
Across all areas
• Proven Solutions, White Papers and Best Practices
• In lockstep with various EMC Business Units (ESD, USD, BRS etc.)
Agenda
• Introduction
• The key factors for SQL Server performance
• Performance Troubleshooting
• EMC Storage Features and SQL Server
• Best Practices
• Storage I/O measuring tools
• Summary
SQL Server Storage Design: The challenge
• SQL Server workloads are non-deterministic
• Generic sizing tools are limited
• Analysis is complex
Recent customer examples
• Pre-deployment
  – Poor benchmarking
    • 5-disk Flash RAID group (RG)
    • SQLIO running single-threaded sequential writes
• Post-deployment
  – Poor performance of VMs
    • 10 * SATA drives (100? VMs)
SQL Server Performance: What is the best configuration for optimal performance?
• The ultimate answer any expert would give you:
IT DEPENDS!!
• Server Hardware/VM specs and configuration
• SQL Server configuration
• I/O Characteristics
• Storage configuration
Server
• Processor
  – More, faster cores per processor (6 -> 8 -> 10)
  – Turbo Boost helps too!
  – Important for single-threaded OLTP query performance
  – Intel IvyBridge-EP will bring significantly better memory and I/O bandwidth in Q1 2012
  – Larger L2 and L3 caches
    • Significant for database performance
• Memory
  – Significantly cheaper as a commodity today
  – Larger memory sizes increase database performance
  – Larger memory reduces physical I/O
Server Virtualization
• Properly manage resource allocation for SQL VMs
  – For Tier 1
    • Maintain a 1:1 ratio of physical cores to vCPUs
    • Do not overcommit memory
  – For lower tiers
    • Decide on an overcommitment factor for physical resources
• Keep NUMA node size in mind when sizing virtual machines
• Disable hyper-threading for DW; consider enabling it for OLTP (a hyper-thread does not provide the full power of a physical core)
In-VM Clustering
[Diagram: two clustered VMs sharing volume G:\ on a hypervisor (Hyper-V or other). Labels: RDM, iSCSI LUN, LSI SCSI adapter, iSCSI initiator, IP]
Pros
• Application-aware clustering
• Faster failover times than the hypervisor’s High Availability
• Ability to have less disruptive patching
Cons
• Added complexity of storage enumeration, 1:1 LUN requirements
• Potentially slower performance (LSI adapter/iSCSI)
• Requires virtualization anti-affinity rules
• No non-disruptive migration using the hypervisor. Cluster failover is disruptive.
SQL Server Virtualization Reference Architecture: Performance
• Storage Profile
  – FAST Cache: 12 * 100 GB Flash Drives (6 * mirrors)
  – Two Pools: 40 * SAS + 5 * EFD (FAST VP)
[Chart: IOPS over nine intervals in three phases: 2-hour baseline, 4-hour FAST VP relocation, 1.5-hour FAST Cache. Axis 5,000 to 30,000. Legend entry: Pool 1: WSFC.
 Data labels, first series: 6,554; 6,559; 7,523; 11,413; 11,781; 12,625; 19,796; 21,273; 22,062.
 Data labels, second series: 7,881; 7,883; 8,661; 12,226; 15,744; 16,247; 25,784; 29,135; 29,409.]
SQL Server: Know Your Workload!
Generalizing SQL Server I/O patterns is difficult; sizing storage for an unknown workload is not trivial
• OLTP (Online Transaction Processing)
  – Typically heavy on random reads/writes (8-64 KB)
  – Queries with many seek operations (select a small number of rows at a time)
  – Measured in IOPS
• DSS/DW
  – Typically 64 KB+ sequential reads
  – Scan-intensive operations (large portions of data: table and range scans)
  – 128-256 KB sequential writes (bulk load)
  – Measured in MB/s
• Operational activities
  – Backup/restore, index rebuild, etc.
• In reality, “mixed” workloads are more common
  – SAP, SharePoint, etc.
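The two units above (IOPS for OLTP, MB/s for DSS/DW) are linked by the average I/O size. A minimal sketch of the conversion follows; the function names and the sample figures are illustrative assumptions, not measurements from this deck.

```python
def throughput_mb_per_s(iops: float, io_size_kb: float) -> float:
    """Throughput in MB/s implied by an I/O rate and an average I/O size."""
    return iops * io_size_kb / 1024.0


def iops_for_throughput(mb_per_s: float, io_size_kb: float) -> float:
    """I/O rate needed to sustain a throughput at a given average I/O size."""
    return mb_per_s * 1024.0 / io_size_kb


# Hypothetical workloads (assumed numbers, for illustration only).
oltp_mb_s = throughput_mb_per_s(iops=10_000, io_size_kb=8)   # small random I/O
dw_iops = iops_for_throughput(mb_per_s=1_000, io_size_kb=64)  # large sequential I/O

print(f"OLTP: 10,000 IOPS at 8 KB  = {oltp_mb_s:.1f} MB/s")
print(f"DW:   1,000 MB/s at 64 KB  = {dw_iops:.0f} IOPS")
```

The same storage can look lightly loaded in MB/s and saturated in IOPS, or the reverse, which is why the workload type decides which unit to size for.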
SQL Server Performance Troubleshooting: C-A-R-T
• Collect
  – SQL DMV stats
  – Perfmon data
  – Hypervisor data
  – NAR/STP
  – Symptoms
• Analyze
  – Disk utilization
  – Disk latencies
  – Read/write ratio
  – Disk IOPS
  – Workload type
  – I/O average
  – I/O peaks
  – DB efficiency
• Recommend
  – DB design
  – DB layout
  – Storage design
  – Disk type
  – RAID type
  – Tiering
  – Protection
• Test
SQL Server
• Keep CPU utilization < 65%
• Optimize Max Memory, leaving some for the O/S
• Lock Pages in Memory
• Perform regular database index maintenance
Volumes & Files
• 64 KB alignment and block sizes
• Pre-allocate log files; free space considerations
• Use the “Perform volume maintenance tasks” privilege
• Use same-performance LUNs across FG volumes
Hypervisor
• SQL vCPU and memory reservations
• Virtual volume or physical volume
• Co-located application workloads on virtual volumes
• Plan for recovery
Connectivity
• Assured bandwidth
• Multi-pathing
Storage
• Disks, RAID, FAST
• Buses, SPs, Cache
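The 64 KB alignment item under Volumes & Files can be checked with simple arithmetic once a partition's starting offset is known. The sketch below assumes the offset has already been read from the operating system; the function name and the two sample offsets are illustrative, not taken from this deck.

```python
ALIGNMENT_BYTES = 64 * 1024  # the 64 KB boundary named in the checklist


def is_aligned(partition_offset_bytes: int, boundary_bytes: int = ALIGNMENT_BYTES) -> bool:
    """True when the partition starts on a multiple of the boundary."""
    return partition_offset_bytes % boundary_bytes == 0


# Illustrative offsets (assumptions for the example):
legacy_offset = 63 * 512      # 32,256 bytes: the old 63-sector default
modern_offset = 1024 * 1024   # 1,048,576 bytes: a 1 MB starting offset

for offset in (legacy_offset, modern_offset):
    state = "aligned" if is_aligned(offset) else "misaligned"
    print(f"offset {offset:>9,} bytes: {state}")
```

A misaligned partition can turn one host I/O into two back-end I/Os when the request straddles a stripe element boundary, which is why the check belongs before any benchmarking.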
Counter: Description
• Disk Reads/sec, Disk Writes/sec: Measures the number of IOPS.
• Avg. Disk sec/Read, Avg. Disk sec/Write: Measures disk latency. Guideline values:
  – 1-5 ms for Log (ideally 1 ms or less on average)
  – 5-20 ms for Database Files (OLTP) (ideally 10 ms or less on average)
  – Less than or equal to 25-30 ms for Data (DSS/DW)
• Avg. Disk Bytes/Read, Avg. Disk Bytes/Write: Measures the size of I/Os. Larger I/Os tend to have higher latency (for example, BACKUP/RESTORE operations issue 1 MB transfers by default).
• Current Disk Queue Length: Displays the number of outstanding I/Os waiting to be read from or written to the disk. High queue depth + high latency = performance problem! High queue depth + low latency = active and efficient system.
• Disk Read Bytes/sec, Disk Write Bytes/sec: Measures total disk throughput (bandwidth).
SQL Server Buffer Manager Perfmon Object
Measured at the SQL Server instance level and useful in determining the ratio of scan-type to seek activity.
• Checkpoint pages/sec: Measures the number of 8 KB database pages per second written to database files in a checkpoint operation.
• Page reads/sec: Measures the number of physical page reads per second.
• Readahead pages/sec: Measures the number of physical page reads performed using the SQL Server read-ahead mechanism (used for scan activity, common for DSS/DW workloads). Read-ahead I/Os can vary in size (8-512 KB). Useful in determining the I/Os generated by scans as opposed to seeks in mixed workload environments.
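The latency guidelines and the read-ahead counter above lend themselves to a small triage helper. This is a sketch under stated assumptions: the threshold table copies the guideline values from the counter list, the split of "25-30 ms" into an ideal and an acceptable bound is an interpretation, and the names `LatencyTarget`, `classify_latency` and `scan_share` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencyTarget:
    ideal_ms: float       # "ideally ... or less on average"
    acceptable_ms: float  # upper end of the quoted range


# Guideline values from the counter list. Treating 25 ms as "ideal" and
# 30 ms as "acceptable" for DSS/DW is an assumption.
TARGETS = {
    "log": LatencyTarget(ideal_ms=1.0, acceptable_ms=5.0),
    "oltp_data": LatencyTarget(ideal_ms=10.0, acceptable_ms=20.0),
    "dw_data": LatencyTarget(ideal_ms=25.0, acceptable_ms=30.0),
}


def classify_latency(file_type: str, avg_disk_sec_per_io: float) -> str:
    """Classify an Avg. Disk sec/Read or sec/Write sample given in seconds."""
    target = TARGETS[file_type]
    latency_ms = avg_disk_sec_per_io * 1000.0
    if latency_ms <= target.ideal_ms:
        return "ideal"
    if latency_ms <= target.acceptable_ms:
        return "acceptable"
    return "investigate"


def scan_share(page_reads_per_sec: float, readahead_pages_per_sec: float) -> float:
    """Fraction of physical page reads that came from read-ahead (scans)."""
    if page_reads_per_sec <= 0:
        return 0.0
    return min(readahead_pages_per_sec / page_reads_per_sec, 1.0)


# Hypothetical samples, not measurements from this deck.
print(classify_latency("log", 0.004))
print(classify_latency("oltp_data", 0.025))
print(f"scan share: {scan_share(2_000, 1_500):.0%}")
```

A high scan share on a system expected to be OLTP suggests missing indexes or reporting queries mixed into the workload, which changes the sizing target from IOPS toward bandwidth.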
SQL Server Best Practices: FILEGROUPS
• More data files do not necessarily provide better performance
  – Determined mainly by hardware capacity and the characteristics of access patterns
  – Data files can be used to maximize the number of spindles (striping)
  – Number of data files per FILEGROUP
    • Traditionally in the range of 0.25 to 1 per CPU core, but is that really the case with 24 cores?
• Best practices:
  – Pre-size data/log files
  – Use equal size for files within a single FILEGROUP
  – Rely on AUTOGROW as little as possible
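To see why the slide questions the traditional rule, the 0.25 to 1 files-per-core range can be worked through for a 24-core server, together with the equal-size guideline. The helper names and the 600 GB filegroup size are assumptions made for illustration only.

```python
import math
from typing import Tuple


def data_file_range(cpu_cores: int, low_ratio: float = 0.25,
                    high_ratio: float = 1.0) -> Tuple[int, int]:
    """Lower and upper data file counts under the traditional per-core rule."""
    low = max(1, math.ceil(cpu_cores * low_ratio))
    high = max(1, math.ceil(cpu_cores * high_ratio))
    return low, high


def equal_file_size_gb(filegroup_size_gb: float, file_count: int) -> float:
    """Size of each file when a filegroup is pre-sized into equal files."""
    return filegroup_size_gb / file_count


low, high = data_file_range(24)
print(f"24 cores: {low} to {high} data files per filegroup")

# Hypothetical 600 GB filegroup, pre-sized at both ends of the range.
for count in (low, high):
    print(f"{count:>2} files x {equal_file_size_gb(600, count):.0f} GB")
```

The rule spans a fourfold range on a 24-core host, so the file count is better chosen from the access pattern and the spindle layout than from the core count alone.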
SQL Server Best Practices: tempdb Considerations
• tempdb placement (dedicated vs. shared spindles)
  – In many scenarios it may be OK to place tempdb on common spindles with data files, utilizing pools
  – Depends on how well you know your workload's use of tempdb
• Sizing tempdb
  – Application time-outs may occur during autogrow operations; preallocate space to allow for the expected workload
  – More details (Optimizing tempdb performance): http://msdn.microsoft.com/en-us/library/ms175527.aspx
• 1 data file per CPU core???
  – Applies mostly to allocation-intensive workloads with heavy tempdb utilization
  – Same practices as data files with respect to sizing and growth
SQL Server Best Practices: Log files
• Log manager activity is sequential in nature
• Disk response time is key
  – Logical Disk counters: Avg. Disk sec/Write
  – SQL Server Databases counters: (Log Flush Wait Time) / (Log Flushes/sec)
  – Consider placing transaction log files on a RAID 1/0 group/pool for lower write latency and faster rebuilds
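The ratio quoted above, (Log Flush Wait Time) / (Log Flushes/sec), gives a rough wait per log flush. The sketch below simply applies that ratio to two counter samples; it assumes both values were taken over the same interval and that the wait time is reported in milliseconds, and the function name and sample values are hypothetical.

```python
from typing import Optional


def avg_wait_per_log_flush_ms(log_flush_wait_time_ms: float,
                              log_flushes_per_sec: float) -> Optional[float]:
    """Approximate wait per log flush, or None when no flushes occurred."""
    if log_flushes_per_sec <= 0:
        return None
    return log_flush_wait_time_ms / log_flushes_per_sec


# Hypothetical counter samples from the same interval.
wait = avg_wait_per_log_flush_ms(log_flush_wait_time_ms=100.0,
                                 log_flushes_per_sec=400.0)
print(f"approx. {wait:.2f} ms waited per log flush")
```

Reading this figure next to Avg. Disk sec/Write on the log volume helps separate storage latency from waits that originate inside SQL Server.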
Validate Storage Configurations: Tools
• SQLIO
  – Use: Test throughput of the I/O subsystem or establish a benchmark of I/O subsystem performance
  – http://www.microsoft.com/downloads/details.aspx?familyid=9a8b005b-84e4-4f24-8d65-cb53442d9e19&displaylang=en
• IOMeter
  – Use: Test throughput of the I/O subsystem or establish a benchmark of I/O subsystem performance
  – Open source tool; allows combinations of I/O types to run concurrently against a test file
  – No support for mount point volumes
  – http://www.iometer.org/
• SQLIOSim
  – Use: Test the I/O stability of a storage subsystem
  – Simulates various patterns of SQL Server I/O
  – Don't consider SQLIOSim for performance benchmarking; use SQLIO instead
  – http://blogs.msdn.com/sqlserverstorageengine/archive/2006/10/06/SQLIOSim-available-for-download.aspx
• Microsoft TPC benchmark kit
• Quest Benchmark Factory
Sizing exercise: Rough Order of Magnitude

Drive type         IOPS
Flash drive        3,500
SAS 15K rpm        180
SAS 10K rpm        150
NL-SAS 7.2K rpm    90

Host IOPS: 25,000
Read/Write ratio: 9:1
RAID type: RAID 5

Expected backend disk IOPS = Host read IOPS + 4 * Host write IOPS
IOPS = (0.9 * 25,000) + 4 * (0.1 * 25,000) = 32,500

Expected to service:
FAST Cache      6     9,000     16,990
FAST VP Pool    45    23,500

FAST VP Pool drives:
Drive type      No. of disks    Drive IOPS    66% drive IOPS
Flash drives    5               17,500        11,550
SAS 10k rpm     40              6,000         3,960
Total                                         15,510
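The arithmetic in this sizing exercise can be reproduced in a few lines. The sketch below uses the slide's own inputs (25,000 host IOPS, a 9:1 read/write ratio, a RAID 5 write penalty of 4, and 66% of rated drive IOPS); the function names are hypothetical, and integer ratio parts are used instead of 0.9 and 0.1 to keep the results exact.

```python
def backend_iops(host_iops: int, read_parts: int, write_parts: int,
                 write_penalty: int = 4) -> float:
    """Back-end disk IOPS = host reads + write_penalty * host writes.

    write_penalty=4 is the RAID 5 factor used on the slide.
    """
    total_parts = read_parts + write_parts
    host_reads = host_iops * read_parts / total_parts
    host_writes = host_iops * write_parts / total_parts
    return host_reads + write_penalty * host_writes


def usable_tier_iops(disk_count: int, iops_per_drive: int,
                     utilization_pct: int = 66) -> float:
    """Tier IOPS at a target drive utilization (66% on the slide)."""
    return disk_count * iops_per_drive * utilization_pct / 100


required = backend_iops(host_iops=25_000, read_parts=9, write_parts=1)
flash = usable_tier_iops(disk_count=5, iops_per_drive=3_500)
sas_10k = usable_tier_iops(disk_count=40, iops_per_drive=150)
pool = flash + sas_10k

print(f"required back-end IOPS: {required:,.0f}")
print(f"FAST VP pool at 66%:    {pool:,.0f}")
print(f"left for FAST Cache:    {required - pool:,.0f}")
```

The remainder, 32,500 minus 15,510, equals the 16,990 figure shown on the FAST Cache row, which suggests that row is the load the pool drives cannot absorb at 66% utilization.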
Storage layout
[Diagram: drive layout across SAS back-end buses 0 & 1 and 2 & 3; enclosure labels 0.1, 1.0, 1.1, 2.0, 3.0, 3.1. Legend: SAS 10k rpm, Flash drives, HS = hot spare]
• Pool 1
  – Tier 0: Flash drives (4+1)
  – Tier 1: SAS, 20 disks (0-19) per enclosure as 4 x (4+1)
• Pool 2
  – Tier 0: Flash drives (4+1)
  – Tier 1: SAS, 20 disks (0-19) per enclosure as 4 x (4+1)
• FAST Cache: 3 mirrors in each flash enclosure
• Dedicated RAID groups
  – RG 1 (4+1): OS volumes
  – RG 2 (2+2): TempDB
  – RG 3 (2+2): TempDB
  – RG 4 (2+2): T-Logs
• Also shown: FLARE drives, System DBs + Quorum, hot spares (HS), empty slots
• LUN placement
  – LUN 201, LUN 202, LUN 203, LUN 204: 100% on SAS
  – LUN 201: 20.21% on Flash, 79.9% on SAS
  – LUN 202: 22.03% on Flash, 77.97% on SAS
Performance
Test phases: Baseline (2 hours), FAST VP relocation (4 hours), FAST Cache (2 hours)

Interval    Transfers/sec (IOPS)    Transactions/sec (TPS)
1           7,881                   1,157
2           7,883                   1,154
3           8,661                   1,246
4           12,226                  1,804
5           15,744                  2,354
6           16,247                  2,434
7           25,784                  3,915
8           29,135                  4,446
9           29,409                  4,484
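The overall gain across the test can be read off the first and last data points of the two performance charts above. The sketch below only divides the chart values; the function name is hypothetical, and pairing the first point with the baseline and the last with the FAST Cache phase is an assumption about the chart layout.

```python
def speedup(first_value: float, last_value: float) -> float:
    """Ratio of the last measured value to the first."""
    return last_value / first_value


# First and last data labels from the two charts.
iops_gain = speedup(7_881, 29_409)
tps_gain = speedup(1_157, 4_484)

print(f"Transfers/sec:    {iops_gain:.2f}x")
print(f"Transactions/sec: {tps_gain:.2f}x")
```

The two ratios are close to each other, which is consistent with the workload being I/O bound: transactions scale roughly in step with the IOPS the storage can deliver.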
Performance
Test phases: Baseline (2 hours), FAST VP relocation (4 hours), FAST Cache (2 hours)

[Chart: Physical Disk Utilization (%), intervals 1-9, axis 0% to 100%. Data labels as extracted: 91%, 91%, 90%, 90%, 85%, 82%; 19%, 19%, 20%, 21%; 54%, 67%, 64%; 76%, 82%, 82%]

Interval    Storage Processor Utilization (%)
1           20%
2           20%
3           32%
4           41%
5           41%
6           34%
7           45%
8           48%
9           48%
Summary
• The key factors for a successful SQL Server design:
  – Server/VM configuration
  – SQL Server configuration
  – I/O characterization
  – Storage configuration
• Know your workload
  – Recommend the right technologies based on the workload
  – Guessing might also work, especially when over-architecting
• Performance troubleshooting
  – Collect, Analyze and Recommend accordingly
• Leverage best practices based on the use case
  – There’s no set of best practices that fits all!
  – Use EMC Proven Solutions as a reference
  – Read/Contribute to Everything Microsoft at EMC
THANK YOU