19
Accelerating Database Performance Using Compression Joey D’Antoni New Jersey SQL Server Users Group 19 March 2013

Accelerating Database Performance with Compression

  • Upload
    jdanton

  • View
    916

  • Download
    0

Embed Size (px)

DESCRIPTION

Presentation about the potential benefits and costs of using data compression within SQL Server. Presented to NJ SQL Group 19 March 2012

Citation preview

Page 1: Accelerating Database Performance with Compression

Accelerating Database Performance Using CompressionJoey D’Antoni

New Jersey SQL Server Users Group

19 March 2013

Page 2: Accelerating Database Performance with Compression

About Me

Principal Architect SQL Server, Comcast Cable

@jdanton – Twitter Joedantoni.wordpress.com [email protected]

Page 3: Accelerating Database Performance with Compression

Overview

Compression—What Does It Really Mean?

Deduplication—What is That?What Should I Compress?Columnstore—How Is It Different?

Page 4: Accelerating Database Performance with Compression

Compression

Page 5: Accelerating Database Performance with Compression

Deduplication

Specialized compression to eliminate duplicate copies of repeating data

Real Example—In VMWare, you may have 10 copies of Windows running on same physical machine. Memory blocks (Common .dlls for example) may be deduplicated.

Backups

Page 6: Accelerating Database Performance with Compression

Benefits of Compression

Faster performance on selectsLess I/O is required to return data

Better Space Utilization on Disk

More Pages In Memory

Page 7: Accelerating Database Performance with Compression

Expenses of Compression

Added CPU CyclesSlower bulk inserts and updates

Page 8: Accelerating Database Performance with Compression

SQL Server Compression Types

Row CompressionPage Compression

Prefix CompressionDictionary Compression

Backup Compression

Page 9: Accelerating Database Performance with Compression

Backup Compression

In all editions of SQL Server, starting with 2008 R2

Always use Backup Compression (even when your storage team says no)

Space is by default pre-allocated for estimated size of uncompressed backup

Trace Flag 3042

Page 10: Accelerating Database Performance with Compression

So What To Compress

SQL Data Compression gives a great deal of flexibility

Can compress tables, indexes and/or partitioning

Can use different methods of compression

How to Decide?

Page 11: Accelerating Database Performance with Compression

Space Savings

sp_estimate_data_compression_savings

What Won’t Compress Well Columns with numeric or fixed-length character data types

where most values require all the bytes allocated for the specific data type

Not much repeating data

Repeating data with non-repeating prefixes

Data stored out of the row

FILESTREAM data

Page 12: Accelerating Database Performance with Compression

Application Workloads

Microsoft advises using for Row Compression for EVERYTHING if you can spare 10% CPU. Don’t do this!

Page Compression has higher overhead

Page 13: Accelerating Database Performance with Compression

Example

-source Microsoft Compression White

Paper

Table Savings ROW %

Savings PAGE % Scans Updates Decision Notes

T1 80% 90% 3.80% 57.27% ROW Low S, very high U. ROW savings close to PAGE

T2 15% 89% 92.46% 0% PAGE Very high S

T3 30% 81% 27.14% 4.17% ROW Low S

T4 38% 83% 89.16% 10.54% ROW High U

T5 21% 87% 0.00% 0% PAGE Append ONLY table

T6 28% 87% 87.54% 0% PAGE High S, low U

T7 29% 88% 0.50% 0% PAGE 99% appends

T8 30% 90% 11.44% 0.06% PAGE 85% appends

T9 84% 92% 0.02% 0.00% ROW ROW savings ~= PAGE

T10 15% 89% 100.00%

0.00% PAGE Read ONLY table

Table 1: Deciding what to compress

Page 14: Accelerating Database Performance with Compression

What Happens with Compression

Tables and Indexes are rebuilt using ALTER TABLE…REBUILD and ALTER INDEX..REBUILD

Requires workspace, CPU and I/O

Same mechanism as rebuilding an index

Free workspace required in User Database Transaction Log Temp B

Page 15: Accelerating Database Performance with Compression

How and When Compress Data

Online vs OfflineConcurrent vs SerialOrder of Compressiong—start small and work up

SORT_IN_TEMPDB

Page 16: Accelerating Database Performance with Compression

Managing Inserts and Updates

Table organization            Table compression setting

ROW PAGE

Heap The newly inserted row is row-compressed.

The newly inserted row is page-compressed:·         if new row goes to an existing page with page compression·         if the new row is inserted through BULK INSERT with TABLOCK·         if the new row is inserted through INSERT INTO ... (TABLOCK) SELECT ... FROMOtherwise, the row is row-compressed.*

Clustered index The newly inserted row is row-compressed.

The newly inserted row is page-compressed if new row goes to an existing page with page compression Otherwise, it is row compressed until the page fills up. Page compression is attempted before a page split.**

Page 17: Accelerating Database Performance with Compression

Underlying Data St

Table compression

Transaction log

Mapping index for rebuilding the clustered index

Sort pages for queries

Version store (with SI or RCSI isolation level)

ROW ROW NONE NONE ROW

PAGE ROW NONE NONE ROW

Page 18: Accelerating Database Performance with Compression

What is Columnstore?

Page 19: Accelerating Database Performance with Compression

ColumnStore Limiations

Non Updateable

Limited Data Types

Can only be nonclustered index

No computed columns

No sparse columns

No indexed views

One Per Table