Performance Tuning SSIS
Brian Knight, CEO Pragmatic [email protected]
HR Departments are no fun.
• Don’t mention the stalking incident with Clay Aiken
• What happened in Vegas
• My prom date with a puppet
• Most unfortunate incident with a turtle
• My fear of bounce houses
• How to sexually harass the HR rep
• What I did to a fish when I was 8
• Any talk about my college years
• The surgery I had last summer
• The stint I had as a traveling gypsy
• Why I am still not allowed back in Texas
• How what I did in Vegas truly can’t stay in Vegas
About Brian
• SQL Server MVP (wasn’t very good with girls)
• Founder of Pragmatic Works (even Kermit the Frog founded a company)
• Author of 15 books (all 15 still awaiting a publisher)
• Blogs at BIDN.com (where he writes about his miniature donkey collection)
Twitter: @BrianKnight
Integration Services in Action
• Integration is a seamless, manageable operation
• Source, prepare, and load data in a single, auditable process
• Scale to handle heavy and complex data requirements
[Diagram: SQL Server Integration Services. Inputs: geospatial data (semi-structured), legacy data (binary files), application database. Components: geospatial components, custom source, standard sources, data-cleansing components, merges, data mining components. Outputs: warehouse, reports, mobile data, cube.]
Advanced Session
Today’s Problems with Integration
Integration today
• Increasing data volumes
• Increasingly diverse sources
Requirements reached the Tipping Point
• Low-impact source extraction
• Efficient transformation
• Bulk loading techniques
Tuning Decisions
• Choose the right tool for the job
• Don’t be afraid to use T-SQL
• Will parallelism work?
Source Optimization
• Flat files – When available, use Fast Parse
• OLE DB sources – Change network packet size
• Use T-SQL whenever possible in the OLE DB Source
  • Joining
  • NULL handling
  • Where clauses
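The advice to do joining, NULL handling, and filtering in the source query can be illustrated with a small runnable sketch. It uses Python's built-in sqlite3 module purely as a stand-in for SQL Server, and the table and column names (orders, customers, region) are hypothetical. In SSIS the same query shape would go into the OLE DB Source as its SQL command.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real source system.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'East'), (2, NULL), (3, 'West');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 2, 0.0), (12, 2, 40.0), (13, 3, 15.0);
""")

# One source query does the join, the NULL handling, and the filtering,
# so the data flow only receives the rows and columns it needs.
SOURCE_QUERY = """
    SELECT o.order_id,
           COALESCE(c.region, 'Unknown') AS region,
           o.amount
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    WHERE o.amount > 0
    ORDER BY o.order_id
"""

rows = conn.execute(SOURCE_QUERY).fetchall()
for row in rows:
    print(row)
```

The alternative, pulling both tables in full and using Merge Join, Derived Column, and Conditional Split in the pipeline, moves more data across the network for the same result.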
Network Traffic: Connection Settings
• Packet size defaults to 4096 bytes
• Increase to 32767 bytes on large data sets
[Diagram: two SQL Server machines (database) and the SSIS package connected over the LAN through a switch]
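As a rough illustration of the packet size setting, the sketch below adds a Packet Size keyword to an OLE DB connection string. The helper name, server name, and database name are hypothetical, and the 512 lower bound is SQL Server's documented minimum rather than something stated on the slide. Treat it as a sketch of the idea, not the way the demo configured its connection manager.

```python
DEFAULT_PACKET_SIZE = 4096   # default noted on the slide
MAX_PACKET_SIZE = 32767      # value the slide suggests for large data sets
MIN_PACKET_SIZE = 512        # assumption: SQL Server's documented minimum


def with_packet_size(connection_string: str, packet_size: int) -> str:
    """Return the connection string with its Packet Size keyword set (hypothetical helper)."""
    if not MIN_PACKET_SIZE <= packet_size <= MAX_PACKET_SIZE:
        raise ValueError(f"packet size must be between {MIN_PACKET_SIZE} and {MAX_PACKET_SIZE}")
    parts = [p.strip() for p in connection_string.split(";") if p.strip()]
    # Drop any existing Packet Size entry before appending the new one.
    parts = [p for p in parts if not p.lower().startswith("packet size=")]
    parts.append(f"Packet Size={packet_size}")
    return ";".join(parts) + ";"


# Hypothetical server and database names.
base = ("Provider=SQLNCLI10;Data Source=SourceServer;Initial Catalog=SourceDb;"
        "Integrated Security=SSPI;Packet Size=4096;")
tuned = with_packet_size(base, MAX_PACKET_SIZE)
print(tuned)
```

Larger packets only help when the network path can carry them efficiently, so it is worth measuring before and after the change.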
Impact of Compression on ETL
[Chart: BULK INSERT into a Heap with and without Data Compression. Series: Time to BULK INSERT 50M rows (min) and Table Size after Load (GB), by Compression Type (NONE, ROW, PAGE). Axes: Time (minutes), 0 to 35; Table Size after Load (GB), 0 to 6.]
* Not official Microsoft results.
Tuning the Source
• Connection manager tuning
• Flat file tuning
• OLE DB Source tuning
Demo
Transform Components
The Pipeline presents the buffer to each downstream component.
SSIS Data Flow Architecture
Synchronous vs. Non Synchronous
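The difference between the two component types can be sketched conceptually in a few lines. This is not the SSIS API: buffers are modeled as plain Python lists, and the function names are hypothetical. A synchronous transform works row by row on the buffer it is handed, while a non-synchronous one (a sort, for example) has to collect its input and emit a new buffer.

```python
from typing import Dict, List

Row = Dict[str, str]
Buffer = List[Row]


def synchronous_upper(buffer: Buffer) -> Buffer:
    """Row-by-row transform: changes rows in place and passes on the same buffer."""
    for row in buffer:
        row["name"] = row["name"].upper()
    return buffer  # same object, no copy


def non_synchronous_sort(buffers: List[Buffer]) -> Buffer:
    """Blocking transform: needs every input buffer before it can emit a new one."""
    all_rows = [row for buffer in buffers for row in buffer]
    return sorted(all_rows, key=lambda row: row["name"])


buf1: Buffer = [{"name": "delta"}, {"name": "alpha"}]
buf2: Buffer = [{"name": "charlie"}, {"name": "bravo"}]

passed_on = [synchronous_upper(b) for b in (buf1, buf2)]
sorted_out = non_synchronous_sort(passed_on)

print(passed_on[0] is buf1)                  # the synchronous step reused the buffer
print([row["name"] for row in sorted_out])   # the sort produced a new, reordered buffer
```

The copy into a new buffer is why non-synchronous components tend to cost more memory and time than synchronous ones.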
Case Study: Patterns
105 seconds 83 seconds
Demo
• Cascading lookup optimizations
• Cache file lookup
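One common reading of a cascading lookup is sketched below: rows first probe a small cache of frequently seen keys, and only the misses fall through to a lookup against the full reference set. The slides do not spell out the demo's design, so the data, names, and two-stage layout here are assumptions for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical reference data: a small cache of frequently seen keys,
# and the full lookup table that is more expensive to probe.
HOT_CACHE: Dict[str, int] = {"US": 1, "GB": 2}
FULL_REFERENCE: Dict[str, int] = {"US": 1, "GB": 2, "FR": 3, "NZ": 4}


def cascading_lookup(code: str) -> Tuple[Optional[int], str]:
    """Return (surrogate_key, stage), where stage names the lookup that matched."""
    if code in HOT_CACHE:
        return HOT_CACHE[code], "first lookup"
    if code in FULL_REFERENCE:
        return FULL_REFERENCE[code], "second lookup"
    return None, "no match"


incoming = ["US", "US", "GB", "FR", "US", "ZZ"]
lookup_results = [cascading_lookup(code) for code in incoming]

# Count how many rows each stage handled.
stage_counts: Dict[str, int] = {}
for _, stage in lookup_results:
    stage_counts[stage] = stage_counts.get(stage, 0) + 1
print(stage_counts)
```

The pattern pays off when most rows match the small cache, so few rows ever reach the second, costlier lookup.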
Data Destinations
• Use “Fast Load” or SQL Server Destination
• Table Lock on insert operations
• Trace flags for improvement
• Old principles still apply
Destination Tuning
Demo
Managing Resources
Logging events to watch pipeline internals
• PipelineExecutionPlan, PipelineExecutionTree, BufferSizeTuning
System Monitor to track I/O issues
• Buffers In Use tracks how many buffers are presently being used
• Buffers Spooled tracks how many 10 MB buffers have been spooled to disk
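To make the 10 MB buffer figure concrete, here is a simplified estimate of how many rows fit in one buffer and how many buffers a load needs. The 10,000-row cap is the SSIS default for DefaultBufferMaxRows, which is not stated on the slide, and the real engine applies further limits, so treat this as back-of-the-envelope arithmetic only. The function names and row widths are hypothetical.

```python
DEFAULT_BUFFER_SIZE = 10 * 1024 * 1024  # 10 MB, the buffer size mentioned on the slide
DEFAULT_BUFFER_MAX_ROWS = 10_000        # assumption: SSIS default for DefaultBufferMaxRows


def rows_per_buffer(row_bytes: int,
                    buffer_size: int = DEFAULT_BUFFER_SIZE,
                    max_rows: int = DEFAULT_BUFFER_MAX_ROWS) -> int:
    """Estimate rows per buffer: capped by both the row limit and the byte limit."""
    if row_bytes <= 0:
        raise ValueError("row_bytes must be positive")
    return max(1, min(max_rows, buffer_size // row_bytes))


def buffers_needed(total_rows: int, row_bytes: int) -> int:
    """Estimate how many buffers a load of total_rows rows passes through."""
    per_buffer = rows_per_buffer(row_bytes)
    return -(-total_rows // per_buffer)  # ceiling division


for width in (200, 4000):
    print(width, rows_per_buffer(width), buffers_needed(1_000_000, width))
```

Narrow rows hit the row cap long before the byte cap, while wide rows fill the buffer with far fewer rows. That is one reason trimming unused columns at the source reduces the number of buffers in flight.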
Measuring Performance
Perfmon
Location
Consider the following configuration…
Where should SSIS run? (Licensing issues aside)
[Diagram: SQL Server 1, SQL Server 2, SSIS Server]
WSRM
Windows System Resource Manager (WSRM) can throttle CPU and memory
• Creates a soft throttle
• Can be scheduled so SSIS gets priority on weekends and nights
• Only activates policy if resources begin to become constrained (about 70%)
• WSRM is free with Windows Server 2003 Enterprise Edition and included in Windows Server 2008
WSRM
Creating a soft schedule cap
Demo
Building a Work Queue System
• Create a work queue table.
• Create a loop that works through the work queue, constantly checking out work.
• Spawn it x times with a batch file.
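The three steps above can be sketched as a small runnable model. It is only an illustration: SQLite stands in for SQL Server, threads stand in for the separate processes a batch file would spawn, and a Python lock stands in for the row locking a database server would provide. The table, column, and file names are hypothetical.

```python
import sqlite3
import threading

# Step 1: create the work queue table (in-memory SQLite as a stand-in).
conn = sqlite3.connect(":memory:", check_same_thread=False)
lock = threading.Lock()  # stand-in for the database's own locking
conn.execute(
    "CREATE TABLE work_queue ("
    "id INTEGER PRIMARY KEY, file_name TEXT, "
    "status TEXT DEFAULT 'pending', worker TEXT)"
)
conn.executemany(
    "INSERT INTO work_queue (file_name) VALUES (?)",
    [(f"extract_{i}.dat",) for i in range(1, 9)],
)
conn.commit()


def check_out(worker: str):
    """Claim the next pending item for this worker, or return None when the queue is empty."""
    with lock:
        row = conn.execute(
            "SELECT id, file_name FROM work_queue "
            "WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE work_queue SET status = 'in progress', worker = ? WHERE id = ?",
            (worker, row[0]),
        )
        conn.commit()
        return row


def mark_done(item_id: int) -> None:
    with lock:
        conn.execute("UPDATE work_queue SET status = 'done' WHERE id = ?", (item_id,))
        conn.commit()


def worker_loop(worker: str) -> None:
    # Step 2: loop, checking out work until nothing is left.
    while True:
        item = check_out(worker)
        if item is None:
            break
        # The real load for item[1] would run here.
        mark_done(item[0])


# Step 3: spawn the loop x times (threads here, separate processes in the demo).
WORKER_COUNT = 4
threads = [threading.Thread(target=worker_loop, args=(f"worker-{n}",))
           for n in range(WORKER_COUNT)]
for t in threads:
    t.start()
for t in threads:
    t.join()

with lock:
    summary = conn.execute(
        "SELECT status, COUNT(*) FROM work_queue GROUP BY status"
    ).fetchall()
print(summary)
```

Because each worker claims an item before processing it, no item is loaded twice, and adding workers needs no change to the queue itself.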
Demo Results
[Chart: start and duration of work items 1 to 8 against elapsed time (axis 00:04.3 to 01:04.8)]
1 process finishes in 64 seconds
[Chart: start and duration of work items 1 to 8 against elapsed time (axis 00:04.3 to 01:04.8)]
2 processes finish in 36 seconds
Demo Results
[Chart: start and duration of work items 1 to 8 against elapsed time (axis 00:04.3 to 01:04.8)]
4 processes finish in 28 seconds
Demo Results
[Chart: start and duration of work items 1 to 8 against elapsed time (axis 00:04.3 to 01:04.8)]
8 processes finish in 27 seconds
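The four totals reported in the demo results (64, 36, 28, and 27 seconds for 1, 2, 4, and 8 processes) can be turned into speedup and per-process efficiency with a little arithmetic. The calculation below is an added illustration and uses only those four figures.

```python
# Total elapsed seconds per process count, as reported in the demo results.
timings = {1: 64, 2: 36, 4: 28, 8: 27}

baseline = timings[1]
speedup_table = []
for processes, seconds in sorted(timings.items()):
    speedup = baseline / seconds        # how many times faster than one process
    efficiency = speedup / processes    # share of ideal linear scaling achieved
    speedup_table.append((processes, round(speedup, 2), round(efficiency, 2)))
    print(f"{processes} process(es): {seconds}s, "
          f"speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

Going from one to two processes nearly halves the run time, but the step from four to eight saves only one second, so in this demo the benefit of adding processes flattens out after about four.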
Parallel Load
Demo
Summary
Planning
• Don’t underestimate the power of the whiteboard!
Use the right tool for the right job
• Leverage the power of the engine
Patterns and Practices
• Understand best practices
• But don’t be afraid to experiment
The End Already?
Questions
http://www.bidn.com/people/brianknight
@BrianKnight
http://www.youtube.com/pragmaticworks