High Performance and High Availability with SQL Server 2012 AlwaysOn Sumeet Bansal, Fusion-IO Kevin Kline, SQL Sentry

SQLintersection keynote a tale of two teams

Shared the stage with Kevin Kline. Paul Randal and Kimberly L. Tripp organized an excellent conference. This slide deck talks about how to design large MS SQL Server architectures with 1000s of databases that are high performance and yet easy to manage. ioMemory by Fusion-io provides performance and SQL Sentry provides an amazing interface to manage and monitor 1000s of databases.

Page 1: SQLintersection keynote a tale of two teams

High Performance and High Availability with SQL Server 2012 AlwaysOn

Sumeet Bansal, Fusion-io

Kevin Kline, SQL Sentry

Page 2: SQLintersection keynote a tale of two teams

© SQLintersection. All rights reserved.

http://www.SQLintersection.com

Introduction: A Tale of Two Teams

Two rival teams, each working to satisfy an important customer.

What’s the hardware solution? What’s the software solution?

Page 3: SQLintersection keynote a tale of two teams

The Customer

A real-world major financial institution headquartered in London.

A core banking application: credit card transactions from ATMs and branches.

Requirement: 10,000 business transactions/sec (not IOPS!).

Highly available using AlwaysOn across hundreds of nodes in many Availability Groups.

... AND IT LOOKS LIKE THIS AT LOAD...

Page 4: SQLintersection keynote a tale of two teams

Meeting the Requirements: Hardware

High Performance on SQL Server means tuning the FULL STACK.

Key Takeaway: This is NOT going to be easy…

[Diagram: the full stack: OS, SQL, CPU, HBA, NIC, Array, Cache, Spindles]

Page 5: SQLintersection keynote a tale of two teams

First Surprise - Memory

At scale, SQL Server does a generally good job of memory management by default.

Some improvements are possible on large CPU/Memory boxes dedicated to SQL Server:

Lock Pages in Memory: big performance gain! Use gpedit.msc to grant it to the SQL Server service account.

Large page allocations (trace flag 834, startup parameter -T834): on Windows 2008 R2, previous issues with this trace flag are fixed. Around 10% throughput increase.

NUMA node memory distribution: beware! Set max memory close to the box maximum if a dedicated box is available.

Page 6: SQLintersection keynote a tale of two teams

Second Surprise - NICs

At scale, network traffic will generate a LOT of interrupts for the CPU.

These must be handled by CPU cores. Packets must be distributed to cores for processing.

Rule of thumb (OLTP): 1 NIC / 16 cores.

Watch the DPC activity in Task Manager.

Remove SQL Server (using affinity masking) from the NIC cores.
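The rule of thumb above is simple arithmetic, sketched here in Python. Only the 16-cores-per-NIC ratio comes from the slide; the function name and the example core counts are hypothetical.

```python
import math

CORES_PER_NIC = 16  # rule of thumb from the slide (OLTP workloads)


def nics_needed(total_cores: int, cores_per_nic: int = CORES_PER_NIC) -> int:
    """Return the NIC count suggested by the rule of thumb, rounding up."""
    if total_cores <= 0:
        raise ValueError("total_cores must be positive")
    return math.ceil(total_cores / cores_per_nic)


if __name__ == "__main__":
    # Hypothetical server sizes, for illustration only.
    for cores in (16, 40, 64):
        print(cores, "cores ->", nics_needed(cores), "NIC(s)")
```

Treat the result as a starting point only; the slide's advice to watch DPC activity still decides whether the count is enough.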

Page 7: SQLintersection keynote a tale of two teams

Drive Selection - General

Number of files matters for SQL Server: TempDB and user databases have multiple files, on segregated arrays.

Other important configs: NTFS allocation unit size at 64 KB; HBA queue depth at 64; Storport HBA driver.

Number of drives matters: more drives = more speed. True for both SAN and DAS. Less so for SSD, but still relevant (especially for NAND).

If designing for performance, make sure the topology can handle it! Understand the path to the drives. Consider workload: random or sequential?

Key Takeaway: Validate and compare configurations prior to deployment

Page 8: SQLintersection keynote a tale of two teams

Rules of Thumb - Disk IO

Traditional spindle throughput (random 8K I/O):

10K RPM: 100-130 IOPS, 'full stroke'

15K RPM: 150-180 IOPS, 'full stroke'

Can achieve 2x or more when 'short stroking' the disks (using less than 20% of the capacity of the physical spindle)

Aggregate throughput when access is sequential:

Between 90 MB/sec and 125 MB/sec for a single drive

If truly sequential, any block size over 8K will give you these numbers

Some 3.5" drives are slightly faster than 2.5"

Approximate latency: 3-5 ms

Cable speed: theoretical 1.5 GB/sec; typical 1.2 GB/sec

PCI-e v2 bus: x4 slot 1.5-1.8 GB/sec; x8 slot 3 GB/sec

HBA speed: 4 Gbit, ~500 MB/sec; 8 Gbit, ~1 GB/sec on a PCI-e x4 v2 bus. Typical: 350-400 MB/sec on 4 Gbit, doubled on 8 Gbit
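To make the spindle numbers concrete, here is a minimal Python sketch that turns a target random-IOPS figure into a rough spindle count. The per-spindle ranges are the rule-of-thumb values from this slide; the 20,000 IOPS target and the function names are hypothetical, and RAID write penalties and caching are ignored.

```python
import math

# Rule-of-thumb random 8K IOPS per spindle, 'full stroke' (from the slide).
IOPS_PER_SPINDLE = {"10K": (100, 130), "15K": (150, 180)}


def spindles_needed(target_iops: int, iops_per_spindle: int) -> int:
    """Spindles required to reach target_iops, rounding up."""
    return math.ceil(target_iops / iops_per_spindle)


def spindle_range(target_iops: int, rpm_class: str) -> tuple:
    """Return (best case, worst case) spindle counts for an RPM class."""
    low, high = IOPS_PER_SPINDLE[rpm_class]
    # Best case assumes the high end of the per-spindle range.
    return spindles_needed(target_iops, high), spindles_needed(target_iops, low)


if __name__ == "__main__":
    target = 20_000  # hypothetical random-IOPS target, not from the deck
    for rpm in ("10K", "15K"):
        best, worst = spindle_range(target, rpm)
        print(f"{rpm} RPM: {best} to {worst} spindles for {target} IOPS")
```

Even at the optimistic end this needs over a hundred drives, which is the point the slide makes about the number of drives mattering.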

Page 9: SQLintersection keynote a tale of two teams

What’s Causing these Non-Disk Bottlenecks?

[Chart: throughput as disks are added. Annotations: added disk pair, backplane limit, added controller. Data labels: 140, 140, 110.]

Page 10: SQLintersection keynote a tale of two teams

Understand the Full Stack to the drives

Key Takeaway: The deeper the topology, the greater the latency and the more important the tuning.

Best Practices: Understand the topology, potential bottlenecks and theoretical throughput of each component in the path! Engage storage engineers early in the process.

Two major topologies for SQL Server storage: DAS (Direct Attached Storage) and SAN (Storage Area Network).
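One way to apply the "theoretical throughput of each component" advice: a path can go no faster than its slowest component. The Python sketch below assumes a simple three-hop path (drives, HBA, PCI-e slot) built from the rule-of-thumb figures earlier in the deck; the path itself and all names are hypothetical.

```python
from typing import Dict, Tuple


def path_ceiling(components_mb_per_sec: Dict[str, float]) -> Tuple[str, float]:
    """Return the slowest component and its throughput in MB/sec."""
    name = min(components_mb_per_sec, key=components_mb_per_sec.get)
    return name, components_mb_per_sec[name]


def example_path(drive_count: int) -> Dict[str, float]:
    """Hypothetical path using the deck's rule-of-thumb numbers."""
    return {
        # Low end of the 90-125 MB/sec sequential range per drive.
        "drives (sequential, 90 MB/sec each)": drive_count * 90.0,
        # High end of the typical 350-400 MB/sec range for 4 Gbit.
        "4Gbit HBA (typical)": 400.0,
        # Low end of the 1.5-1.8 GB/sec range for an x4 slot.
        "PCI-e v2 x4 slot": 1500.0,
    }


if __name__ == "__main__":
    for drives in (4, 8):
        name, limit = path_ceiling(example_path(drives))
        print(f"{drives} drives: limited to {limit:.0f} MB/sec by {name}")
```

With four drives the drives are the limit; with eight, the HBA becomes the bottleneck and further drives add nothing until the HBA is upgraded.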

Page 11: SQLintersection keynote a tale of two teams

Traditional Centralized Architecture

[Diagram: SERVERS (application, CPU and memory, HBA), then NETWORK (switches), then performance-optimized STORAGE (target adapters, CPU and memory, RAID controllers, HDD/SSD) holding active and archive data. Latency and processing time: milliseconds. Workloads: databases, virtualization, web-scale.]

Page 12: SQLintersection keynote a tale of two teams

Shared Data Decentralization

[Diagram: SERVERS (application, CPUs, NAND flash) hold active data; RAID controller and HDD/SSD hold archive data. Latency and processing time: microseconds for active data, milliseconds for archive data. Workloads: databases, virtualization, web-scale.]

Page 13: SQLintersection keynote a tale of two teams

The SAN - Panacea to All IO Issues…

… YEAH RIGHT!

[Chart legend: green = checkpoint, red = tx/sec, black = disk latency]

Page 14: SQLintersection keynote a tale of two teams

DAS vs. SAN - Summary

Feature | SAN | DAS

Cost | High, offset by better utilization | Low, may waste space

Flexibility | More, abstraction allows online configuration changes | Less, get it right the first time!

Skills required | Complex with steep learning curve | Simple and well understood

Additional features | Snapshots; storage replication; thin provisioning | None

Performance | Not high performance technology | High performance for small investment

Reliability | More, very high reliability | Less, depending on RAID level

Clustering support | Yes | No (special implementations exist)

So, which should we choose? SAN or DAS?

Page 15: SQLintersection keynote a tale of two teams

Let’s See What It Can Do!

1 x MS SQL Server: 25 billion transactions/day

(Equivalent to the estimated number of credit card transactions around the globe in a single day)

http://www.fusionio.com/blog/powering-global-commerce-with-sql-server-iomemory/

4 x 1.2TB
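For scale, the daily figure can be converted to a per-second rate and compared with the customer's 10,000 transactions/sec requirement. This Python sketch assumes the load is spread evenly over 24 hours, which the slide does not state; real workloads peak, so read the result as an average only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

DEMO_TRANSACTIONS_PER_DAY = 25_000_000_000  # from this slide
REQUIRED_TX_PER_SEC = 10_000                # from "The Customer" slide


def per_second(transactions_per_day: float) -> float:
    """Average transactions per second, assuming an even 24-hour spread."""
    return transactions_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    rate = per_second(DEMO_TRANSACTIONS_PER_DAY)
    print(f"Average rate: {rate:,.0f} tx/sec")
    print(f"Multiple of the requirement: {rate / REQUIRED_TX_PER_SEC:.1f}x")
```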

Page 16: SQLintersection keynote a tale of two teams

Demo

Turn difficult disk IO tuning into easy ioMemory plug-n-play.

Page 17: SQLintersection keynote a tale of two teams

Meeting the Requirements: Software

Highly transparent instrumentation means monitoring the FULL STACK.

Key Takeaway: This is NOT going to be easy…

[Diagram: the full stack: OS, SQL, CPU, HBA, NIC, Array, Cache, Spindles]

Page 18: SQLintersection keynote a tale of two teams

Instrumentation: PerfMon

Throughput: measured in MB/sec or IOPS by PerfMon, Logical Disk object: Disk Read Bytes/sec, Disk Write Bytes/sec, Disk Reads/sec, Disk Writes/sec.

Latency: measured in milliseconds (ms) by PerfMon, Logical Disk object: Avg. Disk sec/Read, Avg. Disk sec/Write.

More on healthy latency values later.

Key Takeaway: For transparency, PerfMon gives a limited picture of performance.
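The raw counters need a little arithmetic before they read as MB/sec, milliseconds, or average I/O size. The Python sketch below shows those conversions for the read counters; the sample values and the function name are hypothetical, and collecting the counters from PerfMon is out of scope here.

```python
def summarize_read_counters(read_bytes_per_sec: float,
                            reads_per_sec: float,
                            avg_disk_sec_per_read: float) -> dict:
    """Convert raw Logical Disk read counters into friendlier units."""
    # Throughput: bytes/sec to MB/sec.
    mb_per_sec = read_bytes_per_sec / (1024 * 1024)
    # Latency: the counter reports seconds, so scale to milliseconds.
    latency_ms = avg_disk_sec_per_read * 1000.0
    # Average I/O size: bytes per read, expressed in KB (0 when idle).
    avg_io_kb = (read_bytes_per_sec / reads_per_sec / 1024) if reads_per_sec else 0.0
    return {
        "mb_per_sec": mb_per_sec,
        "latency_ms": latency_ms,
        "avg_io_kb": avg_io_kb,
    }


if __name__ == "__main__":
    # Hypothetical sample: 100 MB/sec at 12,800 reads/sec, 4 ms per read.
    print(summarize_read_counters(104_857_600, 12_800, 0.004))
```

The same arithmetic applies to the write counters.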

Page 19: SQLintersection keynote a tale of two teams

Instrumentation: Profiler / Trace

High overhead. Lots of experience needed to filter the results. Deprecated! (But only for the relational engine.)

Key Takeaway: Shows triggered events, but not a comprehensive view of the whole system. Not a reliable long-term solution.

Page 20: SQLintersection keynote a tale of two teams

Instrumentation: DMVs

-- SQL Server 2012 Diagnostic Information Queries, by Glenn Berry, @GlennAlanBerry
-- http://sqlserverperformance.wordpress.com/
-- http://sqlskills.com/blogs/glenn/

-- Get total buffer usage by database for current instance
SELECT DB_NAME(database_id) AS [Database Name],
       COUNT(*) * 8/1024.0 AS [Cached Size (MB)]
FROM sys.dm_os_buffer_descriptors WITH (NOLOCK)
WHERE database_id > 4 -- exclude system databases
  AND database_id <> 32767 -- exclude ResourceDB
GROUP BY DB_NAME(database_id)
ORDER BY [Cached Size (MB)] DESC OPTION (RECOMPILE);

Great information! Built in for SQL Server 2005+. No history. No correlation. No interpretation.

Key Takeaway: Very useful. Not very usable.
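The `COUNT(*) * 8/1024.0` expression in the query works because sys.dm_os_buffer_descriptors returns one row per cached page and each SQL Server page is 8 KB. A minimal Python sketch of the same arithmetic, with a hypothetical page count:

```python
PAGE_SIZE_KB = 8  # SQL Server data pages are 8 KB


def cached_size_mb(page_count: int) -> float:
    """Buffer pool usage in MB for a given number of cached pages."""
    return page_count * PAGE_SIZE_KB / 1024.0


if __name__ == "__main__":
    # Hypothetical: 131,072 cached pages for one database.
    print(cached_size_mb(131_072), "MB")
```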

Page 21: SQLintersection keynote a tale of two teams

Instrumentation: Extended Events

Low overhead. Lots of experience needed to filter the results. How much memory or space? Other administrative questions to answer…

Key Takeaway: Deep data, but is it actionable and proactive information?

Page 22: SQLintersection keynote a tale of two teams

Instrumentation: Notifications

Per-server setup. Requires the SQL Server Agent service. Can only capture error message/severity level, WMI metrics, and PerfMon metrics.

Key Takeaway: Alerts are available, but with high support requirements and limited proactivity.

Page 23: SQLintersection keynote a tale of two teams

Demo

Bringing all the instrumentation together for meaningful, actionable performance information.


Page 24: SQLintersection keynote a tale of two teams

Meeting the Requirements: HA

Need more flexibility than in legacy approaches like log shipping and database mirroring.

Need a shared nothing architecture.

Key Takeaway: This is not too bad UNTIL we scale up …

[Diagram: the full stack: OS, SQL, CPU, HBA, NIC, Array, Cache, Spindles]

Page 25: SQLintersection keynote a tale of two teams

Availability Groups Fundamentals


Page 26: SQLintersection keynote a tale of two teams

Special Considerations: AlwaysOn

Granular control and some visibility into AlwaysOn through SSMS (right-click, Show Dashboard).

Designed for small-scale implementations. As with earlier tools, the user carries the risk and the requirement for expertise.

Page 27: SQLintersection keynote a tale of two teams

Demo

HA + DR management and monitoring at scale.


Page 28: SQLintersection keynote a tale of two teams

How Did We Do It?

[Diagram: the full stack (OS, SQL, CPU, HBA, NIC, Array, Cache, Spindles) reduced to OS + SQL, CPU, and Fusion-io + SQL Sentry]

Page 29: SQLintersection keynote a tale of two teams

References

Thomas Kejser, SQLCAT, and high performance IO tuning: http://blog.kejser.org/tag/sqlcat/ and http://blog.kejser.org/

Jonathan Kehayias & xEvents: http://www.sqlskills.com/blogs/jonathan/category/extended-events/

Joe Sack & AlwaysOn: http://www.sqlskills.com/blogs/joe/answering-questions-with-the-alwayson-dashboard/

SQLPerformance.com (Jonathan Kehayias) instrumentation overhead analysis: http://www.sqlperformance.com/2012/10/sql-trace/observer-overhead-trace-extended-events

Page 30: SQLintersection keynote a tale of two teams

Review

High performance IO is very hard when restricted to disk-only architectures. ioMemory from Fusion-io is the solution!

Highly transparent monitoring and alerting, especially for HA, is very hard with native tools and features. Performance Advisor from SQL Sentry is the solution!

Visit our booths to see the latest releases and sign up for free trials and demonstrations!

Page 31: SQLintersection keynote a tale of two teams

Don’t forget to enter your evaluation of this session using EventBoard!

Questions?

Thank you!