Data Warehousing Features in SQL Server 2008

Page 1: Data Warehousing Features in SQL Server 2008

© Copyright 2009 EMC Corporation. All rights reserved.

Data Warehousing Features in SQL Server 2008

James Rowland-Jones @jrowlandjones

Page 2: Data Warehousing Features in SQL Server 2008

Official DW Feature Set in SQL 2008

Columns: Build, Manage, Deliver Insight

SQL Server RDBMS
  MERGE statement; Change data capture (CDC); Minimally logged INSERT
  Backup compression
  Star join performance; Faster parallel query on partitioned tables; GROUPING SETS
  Resource governor; Data compression; Partition-aligned indexed views

Integration Services
  Lookup performance; Pipeline performance

Analysis Services
  Backup
  MDX Query Performance: Block Computation; Query & Write-back Performance; Scalable Shared Database

Reporting Services
  Reporting scalability; Server scalability

Page 3: Data Warehousing Features in SQL Server 2008

JRJ’s DW Feature Set in SQL 2008

Columns: Build, Manage, Deliver Insight

SQL Server RDBMS
  MERGE statement; Change data capture (CDC); Minimally logged INSERT & TF 610; NEW Data Types
  Backup compression
  Star join performance; Faster parallel query on partitioned tables; Few Outer Rows Parallelism; GROUPING SETS; ISO_WEEK in DATEPART
  Resource governor; Data compression; Partition-aligned indexed views; Partition index rebuilds; Filtered Indexes

Integration Services
  Lookup performance; Pipeline performance; Data Profiling Task

Analysis Services
  Backup
  MDX Query Performance: Block Computation; Query & Writeback Performance; Scalable Shared Database

Reporting Services
  Reporting scalability; Server scalability

Page 4: Data Warehousing Features in SQL Server 2008

What We’ll Focus On

Columns: Build, Manage, Deliver Insight

SQL Server RDBMS
  MERGE statement; Change data capture (CDC); Minimally logged INSERT & TF 610; NEW Data Types
  Backup compression
  Star join performance; Faster parallel query on partitioned tables; Few Outer Rows Parallelism; GROUPING SETS; ISO_WEEK in DATEPART
  Resource governor; Data compression; Partition-aligned indexed views; Partition index rebuilds; Filtered Indexes

Integration Services
  Lookup performance; Pipeline performance; Data Profiling Task

Analysis Services
  Backup
  MDX Query Performance: Block Computation; Query & Writeback Performance; Scalable Shared Database

Reporting Services
  Reporting scalability; Server scalability

Page 5: Data Warehousing Features in SQL Server 2008

Data Compression

Enterprise Edition only
Row and page compression

Compression ratio of 2:1 to 3:1 (roughly a 50% to 67% reduction in data)

Can be applied to a table, an index, or a subset of their partitions
Estimate savings: EXEC sp_estimate_data_compression_savings
Maximum row size plus compression overhead must not exceed 8,060 bytes
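The estimate-then-compress workflow on this slide could look roughly like the sketch below. It is an illustration only: the table name is borrowed from the queries later in the deck, and the choice of PAGE compression and of partition 4 are assumptions, not the demo that was shown.

```sql
-- Estimate the savings before changing anything.
-- NULL for @index_id and @partition_number means "all indexes, all partitions".
EXEC sp_estimate_data_compression_savings
     @schema_name      = N'dbo',
     @object_name      = N'Tbl_Fact_Sales',   -- assumed table name
     @index_id         = NULL,
     @partition_number = NULL,
     @data_compression = N'PAGE';

-- Compress a single partition of the table (partition number is hypothetical).
ALTER TABLE dbo.Tbl_Fact_Sales
    REBUILD PARTITION = 4
    WITH (DATA_COMPRESSION = PAGE);
```

Compressing one partition at a time keeps each rebuild short, which suits a fact table where only older partitions are read-mostly.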

Page 6: Data Warehousing Features in SQL Server 2008

Compression Alert

Compressed Table

UNCOMPRESSED TEXT

Page 7: Data Warehousing Features in SQL Server 2008

Monitoring Compression

Performance counters: SQL Server, Access Methods object
  Page compression attempts/sec
  Pages compressed/sec

Compression statistics for individual partitions
  Dynamic management function sys.dm_db_index_operational_stats
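Both monitoring routes named on this slide can be queried from T-SQL. The following is a minimal sketch, assuming the fact table name used elsewhere in the deck; the column selection is a suggestion rather than what was demonstrated.

```sql
-- Per-partition page compression statistics for one table (table name assumed).
SELECT OBJECT_NAME(s.object_id)          AS table_name,
       s.index_id,
       s.partition_number,
       s.page_compression_attempt_count,
       s.page_compression_success_count
FROM sys.dm_db_index_operational_stats(
         DB_ID(), OBJECT_ID(N'dbo.Tbl_Fact_Sales'), NULL, NULL) AS s
ORDER BY s.index_id, s.partition_number;

-- The two Access Methods counters, read from the DMV instead of Performance Monitor.
SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE N'%Access Methods%'
  AND counter_name IN (N'Page compression attempts/sec',
                       N'Pages compressed/sec');
```

A large gap between attempts and successes suggests pages that are being evaluated for compression without enough savings to justify it.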

Page 8: Data Warehousing Features in SQL Server 2008

DEMO TIME

Resource Governor (Quickly)
Data Compression

Page 9: Data Warehousing Features in SQL Server 2008

P & P

Partitioning Parallelism

Page 10: Data Warehousing Features in SQL Server 2008

Partitioning & Parallelism

Partitioned table parallelism
Few outer rows parallelism
Partition-aligned indexed views
  SQL 2005 behaviour: the indexed view needs to be dropped before the switch
  SQL 2008 behaviour: switching a partition carries the indexed view across
Rebuild index partition

Page 11: Data Warehousing Features in SQL Server 2008

What is a Partitioned Table?

P1 P2 P3 P4

SELECT SUM(Sales_Qty) AS Sales_Qty,
       SUM(Sale_Amt)  AS Sales_Amount
FROM SalesDB.dbo.Tbl_Fact_Sales
WHERE date_id BETWEEN '20050703' AND '20050716'

Page 12: Data Warehousing Features in SQL Server 2008

The “Problem” in SQL 2005

Rows      Executes  StmtText
1         1         SELECT SUM([Sales_Qty]) [Sales_Qty],SUM([Sale_Amt]) [Sales_Amount] FROM [SalesDB].[dbo].[Tbl_Fact_Sales] WHERE [date_id]>=@1 AND [date_id]<=@2
0         0           |--Compute Scalar(DEFINE:([Expr1002]=CASE WHEN [globalagg1008]=(0) THEN NULL ELSE [globalagg1010] END, [Expr1003]=CASE WHEN [globalagg1012]=(0) THEN NULL ELSE [globalagg1014] END))
1         1             |--Stream Aggregate(DEFINE:([globalagg1008]=SUM([partialagg1007]), [globalagg1010]=SUM([partialagg1009]), [globalagg1012]=SUM([partialagg1011]), [globalagg1014]=SUM([partialagg1013])))
2         1               |--Parallelism(Gather Streams)
2         12                |--Stream Aggregate(DEFINE:([partialagg1007]=COUNT_BIG([SalesDB].[dbo].[Tbl_Fact_Sales].[Sales_Qty] as [ss].[Sales_Qty]), [partialagg1009]=SUM([SalesDB].[dbo].[Tbl_Fact_Sales].[Sales_Qty] as [ss].[Sales_Qty]), [partialagg1011]=COUNT_BIG([SalesDB].[dbo].[Tbl_Fact_Sales].[Sale_Amt] as [ss].[Sale_Amt]), [partialagg1013]=SUM([SalesDB].[dbo].[Tbl_Fact_Sales].[Sale_Amt] as [ss].[Sale_Amt])))
20577235  12                  |--Nested Loops(Inner Join, OUTER REFERENCES:([PtnIds1006]) PARTITION ID:([PtnIds1006]))
2         12                    |--Parallelism(Distribute Streams, Demand Partitioning)
2         1                     |    |--Constant Scan(VALUES:(((80)),((81))))
20577235  2                     |--Index Seek(OBJECT:([SalesDB].[dbo].[Tbl_Fact_Sales].[IX_Tbl_Fact_Sales_SKDteItmStrIDSalQtySalAmtDiscMkd] AS [ss]), SEEK:([ss].[SK_Date_ID] >= (20050703) AND [ss].[SK_Date_ID] <= (20050716)) ORDERED FORWARD PARTITION ID:([PtnIds1006]))

Page 13: Data Warehousing Features in SQL Server 2008

Partitioning & Parallelism Compared

[Figure: partitions P1 to P4 shown for SQL Server 2005 and for SQL Server 2008]

Page 14: Data Warehousing Features in SQL Server 2008

Workaround for SQL Server 2005

One query per partition (Partition 3 and Partition 4), combined with UNION:

SELECT SUM(Sales_Qty) AS Sales_Qty, SUM(Sale_Amt) AS Sales_Amount
FROM SalesDB.dbo.Tbl_Fact_Sales
WHERE date_id BETWEEN '20050703' AND '20050709'

UNION

SELECT SUM(Sales_Qty) AS Sales_Qty, SUM(Sale_Amt) AS Sales_Amount
FROM SalesDB.dbo.Tbl_Fact_Sales
WHERE date_id BETWEEN '20050710' AND '20050716'
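As written, the two branches return one row of partial totals each. A hedged sketch of how the same idea could be completed into a single result is shown here: each branch still covers one partition's date range, UNION ALL keeps both partial rows even if they happen to be identical, and an outer SUM adds them up. The derived-table alias p is an assumption for illustration, and whether each branch actually gets its own parallel plan depends on the optimizer.

```sql
-- Sketch only: per-partition partial aggregates rolled up by an outer SUM.
SELECT SUM(p.Sales_Qty)    AS Sales_Qty,
       SUM(p.Sales_Amount) AS Sales_Amount
FROM (
    SELECT SUM(Sales_Qty) AS Sales_Qty, SUM(Sale_Amt) AS Sales_Amount
    FROM SalesDB.dbo.Tbl_Fact_Sales
    WHERE date_id BETWEEN '20050703' AND '20050709'   -- first partition's range

    UNION ALL

    SELECT SUM(Sales_Qty) AS Sales_Qty, SUM(Sale_Amt) AS Sales_Amount
    FROM SalesDB.dbo.Tbl_Fact_Sales
    WHERE date_id BETWEEN '20050710' AND '20050716'   -- second partition's range
) AS p;
```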

Page 15: Data Warehousing Features in SQL Server 2008

Few Outer Rows Parallelism

SQL 2005: one thread is given per page of rows on a nested loop join

SQL 2008: one thread is given per row on a nested loop join

Good for joins to the Date dimension
Microsoft's internal DW scale benchmark: performance increased by 30%

SELECT d.Date_Desc,
       SUM(f.Sale_Amt * f.Sales_Qty)
FROM Tbl_Fact_Store_Sales f
JOIN Tbl_Dim_Date d
  ON f.sk_date_id = d.sk_date_id
WHERE d.date_value BETWEEN '10/1/2004' AND '10/7/2004'
GROUP BY d.Date_Desc

Page 16: Data Warehousing Features in SQL Server 2008

Workarounds for SQL Server 2005

STUFF YOUR ROW
  Add a junk column to the Date dimension to force one row per page

CLUSTER ON A GUID
  Add a column and populate it with GUIDs to encourage rows onto separate pages
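The "stuff your row" idea might be sketched as follows. The column name, constraint name, and the CHAR(4100) width are assumptions chosen for illustration: any fixed-width filler that pushes the row past half a page should have the same effect, provided the existing row plus the filler still fits within the 8,060-byte row limit.

```sql
-- Sketch only: pad each Date dimension row so two rows cannot share an 8 KB page.
-- Junk_Filler and the constraint name are hypothetical.
ALTER TABLE dbo.Tbl_Dim_Date
    ADD Junk_Filler CHAR(4100) NOT NULL
        CONSTRAINT DF_Tbl_Dim_Date_Junk_Filler DEFAULT ('');

-- Rebuild so existing rows are laid out again with the wider row size.
ALTER INDEX ALL ON dbo.Tbl_Dim_Date REBUILD;
```

The cost is a much larger dimension table, which is usually tolerable for a Date dimension with a few thousand rows.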

Page 17: Data Warehousing Features in SQL Server 2008

Partition Aligned Indexed Views

The big chore was "sliding" a table with an indexed view on it.

In 2005 the indexed view needed to be dropped first
In 2008 it does not

Page 18: Data Warehousing Features in SQL Server 2008

IT’S DEMO TIME

Sliding Window with Indexed View in Place
Rebuild Partitioned Index
Filtered Indexes
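The demo scripts are not part of the deck. As a rough sketch of two of the items listed, the statements below rebuild a single partition of an index and create a filtered index. The index name in the first statement comes from the query plan shown earlier; the partition number, the filtered index name, and the filter predicate are assumptions.

```sql
-- Rebuild one partition of a partitioned index (partition number is hypothetical).
ALTER INDEX IX_Tbl_Fact_Sales_SKDteItmStrIDSalQtySalAmtDiscMkd
    ON dbo.Tbl_Fact_Sales
    REBUILD PARTITION = 4;

-- Filtered index covering only recent dates (name and cut-off are hypothetical).
CREATE NONCLUSTERED INDEX IX_Tbl_Fact_Sales_Recent
    ON dbo.Tbl_Fact_Sales (SK_Date_ID)
    INCLUDE (Sales_Qty, Sale_Amt)
    WHERE SK_Date_ID >= 20050701;
```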

Page 19: Data Warehousing Features in SQL Server 2008

STAR JOINS “Optimized” Bitmap Filters

What is a bitmap filter?
  – In-memory structure (no index overhead)
  – Created dynamically
  – Typically quite small in size

Bitmap Filter SQL 2005
  – What it was in 2005...
  – Hash or Merge JOIN

Optimised Bitmap Filter SQL 2008
  – Enterprise Edition
  – Parallel query
  – Hash JOIN only
  – Fact table must have > 100 pages
  – Single-column join (no PK/FK relationship requirement; integer needed for optimized)
  – Dimension input cardinalities are smaller than fact input cardinalities
  – Look for the Bitmap Warning event for missed opportunities to use a bitmap

Page 20: Data Warehousing Features in SQL Server 2008

Minimally Logged Inserts & TF 610
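Only the slide title survives here. As a hedged illustration of the topic, the sketch below enables trace flag 610 for the session and runs an INSERT ... SELECT into an indexed table. The target table dbo.Tbl_Fact_Sales_Load and the date range are hypothetical, and minimal logging also assumes the database is in the SIMPLE or BULK_LOGGED recovery model.

```sql
-- Assumption: recovery model is SIMPLE or BULK_LOGGED.
-- Trace flag 610 allows minimally logged inserts into tables that have indexes.
DBCC TRACEON (610);

-- dbo.Tbl_Fact_Sales_Load is a hypothetical target with a clustered index.
INSERT INTO dbo.Tbl_Fact_Sales_Load (SK_Date_ID, Sales_Qty, Sale_Amt)
SELECT SK_Date_ID, Sales_Qty, Sale_Amt
FROM dbo.Tbl_Fact_Sales
WHERE SK_Date_ID BETWEEN 20050703 AND 20050716;

DBCC TRACEOFF (610);
```

Whether a given load is actually minimally logged depends on the target's state and the load pattern, so it is worth comparing transaction log growth with and without the flag.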

Page 21: Data Warehousing Features in SQL Server 2008

Bulk Load Methods Compared

Page 22: Data Warehousing Features in SQL Server 2008

FOR THE FINAL TIME

STAR JOINS
Minimally Logged INSERTS
