© Copyright 2009 EMC Corporation. All rights reserved.
Data Warehousing Features in SQL Server 2008
James Rowland-Jones @jrowlandjones
Official DW Feature Set in SQL 2008
Build | Manage | Deliver Insight
SQL Server RDBMS
  MERGE statement
  Change data capture (CDC)
  Minimally logged INSERT
  Backup compression
  Star join performance
  Faster parallel query on partitioned tables
  GROUPING SETS
  Resource governor
  Data compression
  Partition-aligned indexed views
Integration Services
  Lookup performance
  Pipeline performance
Analysis Services
  Backup
  MDX Query Performance: Block Computation
  Query & Write-back Performance
  Scalable Shared Database
Reporting Services
  Reporting scalability
  Server scalability
JRJ’s DW Feature Set in SQL 2008
Build | Manage | Deliver Insight
SQL Server RDBMS
  MERGE statement
  Change data capture (CDC)
  Minimally logged INSERT & TF 610
  New data types
  Backup compression
  Star join performance
  Faster parallel query on partitioned tables
  Few Outer Rows Parallelism
  GROUPING SETS
  ISO_WEEK in DATEPART
  Resource governor
  Data compression
  Partition-aligned indexed views
  Partition index rebuilds
  Filtered indexes
Integration Services
  Lookup performance
  Pipeline performance
  Data Profiling Task
Analysis Services
  Backup
  MDX Query Performance: Block Computation
  Query & Write-back Performance
  Scalable Shared Database
Reporting Services
  Reporting scalability
  Server scalability
What We’ll Focus On
Build | Manage | Deliver Insight
SQL Server RDBMS
  MERGE statement
  Change data capture (CDC)
  Minimally logged INSERT & TF 610
  New data types
  Backup compression
  Star join performance
  Faster parallel query on partitioned tables
  Few Outer Rows Parallelism
  GROUPING SETS
  ISO_WEEK in DATEPART
  Resource governor
  Data compression
  Partition-aligned indexed views
  Partition index rebuilds
  Filtered indexes
Integration Services
  Lookup performance
  Pipeline performance
  Data Profiling Task
Analysis Services
  Backup
  MDX Query Performance: Block Computation
  Query & Write-back Performance
  Scalable Shared Database
Reporting Services
  Reporting scalability
  Server scalability
Data Compression
Enterprise Edition only
Row and page compression
Compression ratio of 2 to 1 or 3 to 1, i.e. a 50% to roughly 67% reduction in data
Can be applied to a table, an index, or a subset of their partitions
Estimate savings: exec sp_estimate_data_compression_savings
Maximum row size plus compression overhead must not exceed 8060 bytes
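A minimal sketch of estimating and then applying compression, assuming the fact table from later slides lives in the dbo schema. The partition number and the index name IX_Tbl_Fact_Sales_Date are hypothetical and only illustrate the per-partition and per-index options.

```sql
-- Estimate PAGE compression savings for every index and partition
-- of the fact table (schema name dbo is an assumption).
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'Tbl_Fact_Sales',
     @index_id         = NULL,    -- NULL = all indexes
     @partition_number = NULL,    -- NULL = all partitions
     @data_compression = 'PAGE';  -- or 'ROW' / 'NONE'

-- Compress a single partition of the table (partition 4 is illustrative).
ALTER TABLE dbo.Tbl_Fact_Sales
REBUILD PARTITION = 4
WITH (DATA_COMPRESSION = PAGE);

-- Compress one index across all of its partitions
-- (the index name is hypothetical).
ALTER INDEX IX_Tbl_Fact_Sales_Date
ON dbo.Tbl_Fact_Sales
REBUILD PARTITION = ALL
WITH (DATA_COMPRESSION = ROW);
```

Running the estimate first is cheap relative to the rebuild, so it is usually worth checking before compressing large partitions.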
Compression Alert
Compressed Table
UNCOMPRESSED TEXT
Monitoring Compression
Performance counters: SQL Server, Access Methods object
  Page compression attempts/sec
  Pages compressed/sec
Compression statistics for individual partitions
  Dynamic management function sys.dm_db_index_operational_stats
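Both monitoring sources named on this slide can be queried from T-SQL. The following is a sketch for the current database; the filters and ordering are illustrative choices rather than anything prescribed by the deck.

```sql
-- Per-partition page compression statistics for the current database.
SELECT OBJECT_NAME(s.object_id) AS table_name,
       s.index_id,
       s.partition_number,
       s.page_compression_attempt_count,
       s.page_compression_success_count
FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) AS s
WHERE s.page_compression_attempt_count > 0
ORDER BY table_name, s.index_id, s.partition_number;

-- Server-wide counters from the Access Methods object.
SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Access Methods%'
  AND counter_name IN ('Page compression attempts/sec',
                       'Pages compressed/sec');
```

The values in sys.dm_os_performance_counters for per-second counters are cumulative, so a rate needs two samples taken some interval apart.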
DEMO TIME
Resource Governor (Quickly)
Data Compression
P & P
Partitioning Parallelism
Partitioning & Parallelism
Partition Table Parallelism
Few Outer Rows Parallelism
Partition-Aligned Indexed Views
  SQL 2005 behaviour: the indexed view needs to be dropped before the switch
  SQL 2008 behaviour: switching the partition pulls the indexed view across
Rebuild index partition
What is a Partitioned Table?
P1 P2 P3 P4
SELECT
SUM(Sales_Qty) as Sales_Qty,
SUM(Sale_Amt) as Sales_Amount
FROM SalesDB.dbo.Tbl_Fact_Sales
WHERE date_id between '20050703' and '20050716'
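For readers new to the feature, here is a sketch of how such a table might be partitioned by week. All object names (pf_SalesWeek, ps_SalesWeek, Tbl_Fact_Sales_Demo), the boundary values, and the integer yyyymmdd key are assumptions for illustration, not the presenter's actual schema.

```sql
-- Hypothetical weekly partitioning on an integer yyyymmdd date key.
-- Three boundary values produce four partitions (P1..P4).
CREATE PARTITION FUNCTION pf_SalesWeek (int)
AS RANGE RIGHT FOR VALUES (20050626, 20050703, 20050710);

-- Map every partition to the PRIMARY filegroup for simplicity.
CREATE PARTITION SCHEME ps_SalesWeek
AS PARTITION pf_SalesWeek ALL TO ([PRIMARY]);

-- Rows are placed into a partition according to date_id.
CREATE TABLE dbo.Tbl_Fact_Sales_Demo
(
    date_id   int   NOT NULL,
    Sales_Qty int   NOT NULL,
    Sale_Amt  money NOT NULL
) ON ps_SalesWeek (date_id);

-- Which partition does a given key land in?
-- With these boundaries, 20050705 falls in partition 3.
SELECT $PARTITION.pf_SalesWeek(20050705) AS partition_number;
```

With this layout a two-week range such as 20050703 to 20050716 touches exactly two partitions, which is the situation the following slides examine.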
The “Problem” in SQL 2005
Rows      Executes  StmtText
1         1         SELECT SUM([Sales_Qty]) [Sales_Qty],SUM([Sale_Amt]) [Sales_Amount] FROM [SalesDB].[dbo].[Tbl_Fact_Sales] WHERE [date_id]>=@1 AND [date_id]<=@2
0         0           |--Compute Scalar(DEFINE:([Expr1002]=CASE WHEN [globalagg1008]=(0) THEN NULL ELSE [globalagg1010] END, [Expr1003]=CASE WHEN [globalagg1012]=(0) THEN NULL ELSE [globalagg1014] END))
1         1                |--Stream Aggregate(DEFINE:([globalagg1008]=SUM([partialagg1007]), [globalagg1010]=SUM([partialagg1009]), [globalagg1012]=SUM([partialagg1011]), [globalagg1014]=SUM([partialagg1013])))
2         1                     |--Parallelism(Gather Streams)
2         12                         |--Stream Aggregate(DEFINE:([partialagg1007]=COUNT_BIG([SalesDB].[dbo].[Tbl_Fact_Sales].[Sales_Qty] as [ss].[Sales_Qty]), [partialagg1009]=SUM([SalesDB].[dbo].[Tbl_Fact_Sales].[Sales_Qty] as [ss].[Sales_Qty]), [partialagg1011]=COUNT_BIG([SalesDB].[dbo].[Tbl_Fact_Sales].[Sale_Amt] as [ss].[Sale_Amt]), [partialagg1013]=SUM([SalesDB].[dbo].[Tbl_Fact_Sales].[Sale_Amt] as [ss].[Sale_Amt])))
20577235  12                              |--Nested Loops(Inner Join, OUTER REFERENCES:([PtnIds1006]) PARTITION ID:([PtnIds1006]))
2         12                                   |--Parallelism(Distribute Streams, Demand Partitioning)
2         1                                    |    |--Constant Scan(VALUES:(((80)),((81))))
20577235  2                                    |--Index Seek(OBJECT:([SalesDB].[dbo].[Tbl_Fact_Sales].[IX_Tbl_Fact_Sales_SKDteItmStrIDSalQtySalAmtDiscMkd] AS [ss]), SEEK:([ss].[SK_Date_ID] >= (20050703) AND [ss].[SK_Date_ID] <= (20050716)) ORDERED FORWARD PARTITION ID:([PtnIds1006]))
Partitioning & Parallelism Compared
[Diagram comparing how partitions P1 to P4 are processed in SQL Server 2005 and in SQL Server 2008]
Workaround for SQL Server 2005
-- Partition 3
SELECT SUM(Sales_Qty) as Sales_Qty, SUM(Sale_Amt) as Sales_Amount
FROM SalesDB.dbo.Tbl_Fact_Sales
WHERE date_id between '20050703' and '20050709'
-- UNION ALL keeps both rows even if the two partial sums happen to be equal
UNION ALL
-- Partition 4
SELECT SUM(Sales_Qty) as Sales_Qty, SUM(Sale_Amt) as Sales_Amount
FROM SalesDB.dbo.Tbl_Fact_Sales
WHERE date_id between '20050710' and '20050716'
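The two per-partition queries return one row each, whereas the original query returned a single total. One way to get back to a single row, sketched here with UNION ALL so that identical partial sums are not collapsed, is to aggregate the partial results in an outer query. The alias w is arbitrary, and whether the optimizer parallelises each branch as hoped should be confirmed from the actual plan.

```sql
-- Roll the per-partition partial sums back up into one total.
SELECT SUM(w.Sales_Qty)    AS Sales_Qty,
       SUM(w.Sales_Amount) AS Sales_Amount
FROM (
    -- First week's partition
    SELECT SUM(Sales_Qty) AS Sales_Qty, SUM(Sale_Amt) AS Sales_Amount
    FROM SalesDB.dbo.Tbl_Fact_Sales
    WHERE date_id BETWEEN '20050703' AND '20050709'
    UNION ALL
    -- Second week's partition
    SELECT SUM(Sales_Qty) AS Sales_Qty, SUM(Sale_Amt) AS Sales_Amount
    FROM SalesDB.dbo.Tbl_Fact_Sales
    WHERE date_id BETWEEN '20050710' AND '20050716'
) AS w;
```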
Few Outer Rows Parallelism
SQL 2005: one thread given per page of rows on a nested loop join
SQL 2008: one thread given per row on a nested loop join
Good for joins to the date dimension
Microsoft's internal DW scale benchmark: performance increased by 30%
SELECT d.Date_Desc, SUM(f.Sale_Amt*f.Sales_Qty)
FROM Tbl_Fact_Store_Sales f
JOIN Tbl_Dim_Date d
ON f.sk_date_id = d.sk_date_id
WHERE d.date_value between '10/1/2004' and '10/7/2004'
GROUP BY d.Date_Desc
Workarounds for SQL Server 2005
STUFF YOUR ROW
  Add a junk column to the date dimension to force one row per page
CLUSTER ON A GUID
  Add a column and populate it with GUIDs to encourage rows onto separate pages
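A sketch of the "stuff your row" idea, assuming the date dimension is dbo.Tbl_Dim_Date, that it has a clustered index, and that its existing rows are narrow enough for the padded row to stay under the 8060-byte limit. The column and constraint names are hypothetical.

```sql
-- Pad each date dimension row so that only one fits on an 8 KB page.
-- char(5000) is more than half a page, so two rows cannot share one.
ALTER TABLE dbo.Tbl_Dim_Date
ADD Junk_Filler char(5000) NOT NULL
    CONSTRAINT DF_Tbl_Dim_Date_Junk DEFAULT ('');

-- Rebuild so existing rows are physically laid out one per page.
ALTER INDEX ALL ON dbo.Tbl_Dim_Date REBUILD;

-- Check the result: row_count should roughly match the data page count.
SELECT row_count, in_row_data_page_count
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID('dbo.Tbl_Dim_Date')
  AND index_id IN (0, 1);
```

The trade-off is deliberate waste: a date dimension is small, so inflating it costs little space while giving SQL Server 2005 more pages to hand out to threads.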
Partition Aligned Indexed Views
The big chore was “sliding” a partitioned table with an indexed view on it.
In 2005 the indexed view needed to be dropped first
In 2008 it does not
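The "slide" itself is a metadata-only partition switch followed by a boundary merge. Below is a sketch under several assumptions: the archive table name, the partition function name pf_SalesWeek, and the boundary value are hypothetical; the archive table is empty, has the same structure, and sits on the same filegroup; and any indexed view involved is partition-aligned with its base table.

```sql
-- Switch the oldest partition out of the fact table into an archive table.
-- In SQL Server 2008 a partition-aligned indexed view can stay in place.
ALTER TABLE dbo.Tbl_Fact_Sales
SWITCH PARTITION 1 TO dbo.Tbl_Fact_Sales_Archive;

-- Remove the now-empty boundary so the window moves forward.
ALTER PARTITION FUNCTION pf_SalesWeek()
MERGE RANGE (20050626);
```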
IT’S DEMO TIME
Sliding Window with Indexed View in Place
Rebuild Partitioned Index
Filtered Indexes
STAR JOINS “Optimized” Bitmap Filters
What is a bitmap filter?
– In-memory structure (no index overhead)
– Created dynamically
– Typically quite small in size
Bitmap Filter, SQL 2005
– What it was in 2005...
– Hash or Merge JOIN
Optimized Bitmap Filter, SQL 2008
– Enterprise Edition
– Parallel query
– Hash JOIN only
– Fact table must have > 100 pages
– Single-column join (no PK/FK relationship required; integer needed for optimized)
– Dimension input cardinalities are smaller than fact input cardinalities
– Look for the Bitmap Warning event for missed opportunities to use a bitmap
Minimally Logged Inserts & TF 610
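As a rough illustration of the two techniques in this slide's title: in SQL Server 2008 an INSERT ... SELECT into a heap can be minimally logged when a TABLOCK hint is used, and trace flag 610 extends minimal logging to inserts into tables with indexes. The sketch assumes the database uses the SIMPLE or BULK_LOGGED recovery model; the staging and indexed table names are hypothetical, with column names borrowed from the fact table in earlier slides.

```sql
-- Minimally logged load into a heap: TABLOCK is the key hint.
INSERT INTO dbo.Tbl_Fact_Sales_Stage WITH (TABLOCK)
       (date_id, Sales_Qty, Sale_Amt)
SELECT date_id, Sales_Qty, Sale_Amt
FROM SalesDB.dbo.Tbl_Fact_Sales
WHERE date_id BETWEEN '20050703' AND '20050709';

-- Trace flag 610: allow minimal logging for inserts into indexed tables.
DBCC TRACEON (610);   -- session scope; DBCC TRACEON (610, -1) is global

INSERT INTO dbo.Tbl_Fact_Sales_Indexed
       (date_id, Sales_Qty, Sale_Amt)
SELECT date_id, Sales_Qty, Sale_Amt
FROM dbo.Tbl_Fact_Sales_Stage;

DBCC TRACEOFF (610);
```

Whether a given load was actually minimally logged depends on the recovery model, the target structure, and whether the target already holds data, so it is worth verifying log growth on a test load.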
Bulk Load Methods Compared
FOR THE FINAL TIME
STAR JOINS
Minimally Logged INSERTS