SQL Server: Performance & Balanced System DesignBy Frank [email protected]
Agenda
- Introduction
- Benchmarking
- Tools & Monitoring
- Siebel Database
- Siebel Queries
- Siebel Query Repro
- Siebel & Statistics
- Wait Types
- Locating Long Running Queries
- Optimizing Data Loading
- Balanced System Design
  - Network Performance
  - IO
  - Memory
  - File Management
- Summary
Dealing with Complexity
- Key theme: simple but NOT simplistic
- Static configuration parameters replaced with dynamic algorithms employing adaptive feedback
- Administrative control retained to manage system-wide resources
- You can still constrain the amount of memory used by SQL Server if you want to
Dynamic Control
[Chart: control value over time, tracking the instantaneous optimal value of the control]
Dynamic Control Analogy
- Ignition timing, user controlled by lever: the classic knob
- Vacuum advance: uses simple feedback (a major advance)
- Full feedback: continuously adjusted; monitors temperature, speed, engine response, etc.
Memory, CPU & Clustering Limits: Windows Server 2003 and SQL Server 2000
(1) With Address Windowing Extensions (AWE) in SQL Server 2000
The highly scalable database platform for memory-intensive, performance-critical business applications
- Optimized for Windows Server 2003 and Itanium
- Great performance
- Large memory addressability (up to 32 TB)
- Nearly unlimited virtual memory (up to 8 TB)
- I/O savings due to larger memory buffer pools
- T-SQL code compatibility with SQL Server 2000
- 8-node clustering support
- Same on-disk format as 32-bit for easy migration
- One setup for database & OLAP based on Windows Installer technology
- Compelling alternative to expensive Unix solutions
PSPP Results as of Oct 26, 2003
- 10/08/03 - 5,000 Concurrent Users on Unisys ES7000 server and Microsoft SQL Server 2000
- 10/07/03 - 32,000 Concurrent Users on HP-UX servers
- 10/07/03 - 12,000 Concurrent Users on HP ProLiant and Integrity servers and Microsoft SQL Server 2000 (64-bit)
- 06/03/03 - 10,000 Concurrent Users on HP-UX servers
- 04/24/03 - 30,000 Concurrent Users on Unisys ES7000/2000 Series of servers and Microsoft SQL Server 2000 (64-bit)
- 02/05/03 - 5,000 Concurrent Users on IBM eServer xSeries and Microsoft SQL Server 2000
- 10/21/02 - 4,500 Concurrent Users on IBM eServer xSeries and IBM DB2 UDB
- 06/24/02 - 30,000 Concurrent Users on IBM eServer pSeries and IBM DB2 UDB
30,000 concurrent users: world-class performance
A typical PSPP environment: HP 12K
12,000 User Benchmark on HP/Windows/SQL64: Resource Utilization

Node | Functional Use | Avg CPU Utilization (%) | Avg Memory Utilization (MB)
4 x ProLiant DL760 | Web Server: Application Requests | 8% | 600
3 x ProLiant BL20p | Web Server: Application Requests | 7% | 500
1 x ProLiant DL760 | Web Server: HTTP Adapter, WF | 9% | 400
1 x ProLiant 6400R | Siebel Gateway Server | 3% | 200
4 x ProLiant DL580 | Siebel Application Server: End Users | 13% | 5,000
8 x ProLiant BL40p | Siebel Application Server: End Users | 11% | 4,700
1 x ProLiant DL580 | Siebel Application Server: EAI HTTP Adapter + WF | 25% | 2,200
1 x ProLiant DL760 | Siebel Application Server: EAI-MSMQ Adapter | 21% | 916
1 x ProLiant BL20p | Siebel Application Server: AM | 3% | 80
1 x Integrity rx5670 | Microsoft SQL Server 2000 (64-bit) | 47% | 13,300
12,000 User Benchmark on HP/Windows/SQL64 Concurrent Users
Server Component Throughput

SQL Server 2000 (64-bit) on a 4x 1.5 GHz Itanium 2 HP Integrity used 47% CPU and 13.3 GB of memory, proving unprecedented price/performance for Siebel.
Workload | Number of Users | Avg Operation Response Time (sec) | Business Transactions Throughput/Hour | Projected Daily Transactions
Sales / Service Call Center | 20,000 | 0.148 | 122,041 | 976,328
PRM | 4,000 | 0.182 | 27,615 | 220,920
eSales | 3,000 | 0.233 | 17,134 | 137,072
eService | 3,000 | 0.196 | 40,455 | 323,640
Totals | 30,000 | N/A | 207,245 | 1,657,960

Workload | Business Transactions Throughput/Hour | Projected Transactions, 8-Hour Day
Assignment Manager | 62,012 | 496,096
EAI - HTTP Adapter | 496,056 | 3,968,448
EAI - MQ Series Adapter | 294,539 | 2,356,312
Workflow Manager | 116,944 | 935,552
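The "projected transactions, 8-hour day" column above is simply the hourly throughput sustained for eight hours, which is easy to verify (the figures below are the published server-component numbers from the table; nothing else is assumed):

```python
# Verify the benchmark's projection: hourly throughput sustained over
# an 8-hour day. Figures are the published server-component numbers.

HOURS_PER_DAY = 8

throughput_per_hour = {
    "Assignment Manager": 62_012,
    "EAI - HTTP Adapter": 496_056,
    "EAI - MQ Series Adapter": 294_539,
    "Workflow Manager": 116_944,
}

for component, hourly in throughput_per_hour.items():
    # e.g. 62,012/hour -> 496,096 per 8-hour day
    print(f"{component:<26} {hourly * HOURS_PER_DAY:>9,} / 8-hour day")
```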
Workload | Number of Users | Avg Operation Response Time to LoadRunner (sec) | Business Transactions Throughput/Hour | Projected Transactions, 8-Hour Day
Sales / Service Call Center | 8,400 | 0.137 | 43,662 | 349,300
eChannel (PRM) | 1,200 | 0.131 | 16,130 | 129,037
eSales | 1,200 | 0.144 | 8,164 | 65,313
eService | 1,200 | 0.162 | 15,462 | 123,694
Totals | 12,000 | N/A | 83,418 | 667,344

Workload | Business Transactions Throughput/Hour | Projected Daily Transactions
Assignment Manager | 38,599 | 308,792
EAI - HTTP Adapter | 746,676 | 5,973,408
EAI - MQ Series Adapter | 545,472 | 4,363,776
Workflow Manager | 96,299 | 770,392
Oracle 10K vs. SQL Server 12K on the DB tier
- Oracle supported 10K users on an rp8400 with 16x 875 MHz CPUs running Oracle 9.x on HP-UX, posting 35% CPU and 18.2 GB of memory.
- MSSQL supported 12K users on an rx5670 with 4x 1.5 GHz CPUs running SQL Server 2000 on Windows Server 2003, posting 47% CPU and 13.3 GB of memory.
- Result: 17.95% less CPU**, 26% less memory, 60% less cost, 20% more users (12K vs. 10K).
- SQL Server 2000 64-bit did more with less.

* HP rx5670 is around $50K on the HP web site at full price; HP rp8400 base price is $124K.
** rp8400 16 CPU SpecInt 98.2; rx5670 4 CPU SpecInt 60. (98.2 * 35%) = 34.37; (60 * 47%) = 28.2; 1 - (28.2 / 34.37) = 17.95%. See www.spec.org. (Scott Hall slide)
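The SpecInt-normalized CPU comparison in the footnote can be reproduced with a little arithmetic (a sketch; the SpecInt ratings and CPU percentages are the ones quoted on the slide, not independently verified):

```python
# Reproduce the slide's SpecInt-normalized CPU comparison.
# All figures are taken from the slide's footnote.

ORACLE_SPECINT = 98.2   # rp8400, 16 CPUs
ORACLE_CPU_PCT = 0.35   # 35% average CPU utilization
MSSQL_SPECINT = 60.0    # rx5670, 4 CPUs
MSSQL_CPU_PCT = 0.47    # 47% average CPU utilization

# "SpecInt-units consumed" = total rated capacity * fraction actually used
oracle_used = ORACLE_SPECINT * ORACLE_CPU_PCT   # 34.37
mssql_used = MSSQL_SPECINT * MSSQL_CPU_PCT      # 28.2

# SQL Server consumed this much less normalized CPU than Oracle:
cpu_savings = 1 - (mssql_used / oracle_used)

print(f"Oracle consumed {oracle_used:.2f} SpecInt-units, SQL Server {mssql_used:.2f}")
print(f"CPU savings: {cpu_savings:.2%}")
```

Note that SQL Server also served 20% more users (12K vs. 10K), which the slide counts as a separate win on top of the CPU savings.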
Tuning Tools
- Windows System Monitor (PERFMON.EXE): System Monitor in the Performance console, started from Administrative Tools
- Query Analyzer (ISQLW.EXE): graphical showplan, statistics profile
- Profiler (PROFILER.EXE): spot problematic queries; use the Tuning or Duration templates; monitor the overhead carefully on your system
- Index Tuning Wizard: particularly for initial EIM loads
Exploring I/O with System Monitor
- Make sure you run "diskperf -y" on the command line so the disk counters are collected
- Performance object: Physical Disk
- %Disk Time: the number for an entire logical drive should be less than 100%
- Avg. Disk Queue Length (system requests waiting, on average, for disk access): want ~0; if it is above 2 (definitely above 3), look into it; see whether it is sustained queueing or temporary
- Avg. Disk sec/Read and Avg. Disk sec/Write (different counters; remember Logical vs. Physical): nice to have is 5 to 7 ms (might be optimistic); realistic with today's technology is 20 to 25 ms on a moderately loaded system; log device write service times should be below 20 ms; technology dependent
- See BOL (index "Bottlenecks", then "Monitoring Disk Activity") for some more tips
System Monitor: Useful Counters
- Processor: % Processor Time
- Physical Disk: % Disk Time, Avg. Disk Queue Length
- Memory: Available MBytes
- System: Context Switches/sec
- SQL Server: Locks: Lock Waits/sec, Number of Deadlocks/sec
- SQL Server: Access Methods: Full Scans/sec, Page Splits/sec, Table Lock Escalations/sec
- SQL Server: Buffer Manager: Buffer Cache Hit Ratio, Lazy Writes/sec, Page Reads/sec, Page Writes/sec, Readahead Pages/sec
- SQL Server: Databases: Transactions/sec
- SQL Server: General Statistics: User Connections
- Q150934: How to Create a Performance Monitor Log
Profiler Terminology
- Template: defines criteria for what to monitor; saved in a .tdf file
- Trace: captures data based upon selected events, data columns, and filters
- Filter: limits the results (Equal, Not like, etc.)
- Event Category: defines the way events are grouped
- Event: an action generated within the SQL engine
Profiler
- Use the built-in templates
- Find the worst-performing queries: filter by duration
- Identify the cause of a deadlock
- Monitor stored procedure performance
- Audit activity (C2 audits)
- Reorder your output columns by Duration, CPU, Reads, Writes, TextData, etc.
Profiler in Production
- Can be very CPU intensive; my experience in production: 8 CPUs at 100%
- Filter! Filter! Filter! Let the computer tell you what's going foul
- To proactively find out what is going wrong: filter on Duration > 30,000 ms and run for 24 hours; this will show you all the poorly running queries on your system
- Can't make a query run faster? Look at IOs in loops or queries running all the time; repack to a higher fill factor; going from 5 IOs to 4 IOs is a 20% improvement
- McBath's Oil Argument: do you notice your car running better when you change your oil? No, but your car sure does. Make your database server run as efficiently as possible; that makes it more scalable. More with less.
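The "more with less" arithmetic behind the 5-IO to 4-IO example compounds across every execution of a hot query; a tiny illustration (the per-lookup IO counts come from the slide's example, the daily execution count is a made-up workload size):

```python
# Illustrative only: how shaving one logical IO off a frequent lookup
# scales. 5 -> 4 IOs is the slide's example; the execution count is a
# hypothetical workload size, not a measured figure.

ios_before = 5                    # page reads per lookup before repacking
ios_after = 4                     # page reads per lookup after repacking
executions_per_day = 1_000_000    # assumed query volume

reduction = 1 - ios_after / ios_before
saved_per_day = (ios_before - ios_after) * executions_per_day

print(f"IO reduction per lookup: {reduction:.0%}")
print(f"Logical IOs saved per day: {saved_per_day:,}")
```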
Siebel Database
- Over 2,300 tables, many with over 120 columns per table
- 2,200 clustered indexes (ROW_ID)
- 10,500 non-clustered indexes: over-indexed, many 100% NULL indexes
- 10,000+ default constraints
- 25,000+ check constraints
- Very few stored procedures
- Some triggers, used by Workflow / Assignment Manager
Siebel Logical Schema vs. Physical Schema
- The Siebel database schema is designed as a cross-platform logical schema, which is managed by Siebel Tools
- The logical schema is translated to a physical schema by Siebel database utilities such as DDLIMP
- The logical schema may be altered and mapped to a compatible physical data type depending on the DB platform and code page
Don't Be Afraid to Add/Change Indexes
- You almost always have to
- Work closely with Siebel Expert Services
- Make sure your Siebel metadata is in sync with the SQL Server metadata, or bad things can happen the next time you run DDLSYNC
- See the examples in the next slides
Reality (I)

Out of box:

sp_helpindex EIM_CONTACT3
EIM_CONTACT3_T01, EIM_CONTACT3_T02, EIM_CONTACT3_T03, EIM_CONTACT3_T04, EIM_CONTACT3_T05, EIM_CONTACT3_T06, EIM_CONTACT3_T07, EIM_CONTACT3_T08, EIM_CONTACT3_T09, EIM_CONTACT3_T10, EIM_CONTACT3_T11, EIM_CONTACT3_T12, EIM_CONTACT3_T13, EIM_CONTACT3_T14, EIM_CONTACT3_T15, EIM_CONTACT3_T16, EIM_CONTACT3_T17, EIM_CONTACT3_T18, EIM_CONTACT3_T19, EIM_CONTACT3_T20, EIM_CONTACT3_U1

A large customer:

sp_helpindex EIM_CONTACT3
EIM_CONTACT3_T01, EIM_CONTACT3_U1
Reality (II)

Out of box:
S_CONTACT_EI, S_CONTACT_F10, S_CONTACT_F11, S_CONTACT_F12, S_CONTACT_F13, S_CONTACT_F15, S_CONTACT_F2, S_CONTACT_F3, S_CONTACT_F4, S_CONTACT_F5, S_CONTACT_F6, S_CONTACT_F7, S_CONTACT_F8, S_CONTACT_II, S_CONTACT_M1, S_CONTACT_M11, S_CONTACT_M12, S_CONTACT_M13, S_CONTACT_M14, S_CONTACT_M15, S_CONTACT_M16, S_CONTACT_M17, S_CONTACT_M18, S_CONTACT_M19, S_CONTACT_M2, S_CONTACT_M20, S_CONTACT_M21, S_CONTACT_M22, S_CONTACT_M3, S_CONTACT_M4, S_CONTACT_M6, S_CONTACT_M8, S_CONTACT_M9, S_CONTACT_P1, S_CONTACT_U1, S_CONTACT_U2, S_CONTACT_V1, S_CONTACT_V2, S_CONTACT_V3, S_CONTACT_V5

A large customer (indexes marked _X were custom):
S_CONTACT_EI, S_CONTACT_F6_X, S_CONTACT_II, S_CONTACT_M1, S_CONTACT_M50, S_CONTACT_M8, S_CONTACT_ML1_X, S_CONTACT_ML2_X, S_CONTACT_ML3_X, S_CONTACT_ML4_X, S_CONTACT_ML5_X, S_CONTACT_ML6_X, S_CONTACT_P1, S_CONTACT_PREM01_X, S_CONTACT_PREM02_X, S_CONTACT_U1, S_CONTACT_U2, S_CONTACT_V3
Poor Indexes
- Get rid of 100% NULL indexes
- They cost a lot on INSERTs/UPDATEs/DELETEs (e.g. during EIM), cost disk space, and cost tape space when you back them up
- The only time one might be used is on an aggregate, where it is cheaper than a full scan. Rare, though.
- A stored procedure (see Appendix) will examine the indexes on the top 100 tables for indexes that probably will not be used.
Siebel OM Query Execution and Fetching Mechanism
- Up to 20 tables in a join
- Siebel OM uses server-side API cursors: for List applet functionality (i.e., to maintain user state and support pending result sets), and to support multiple active statements per connection
- Fast-forward cursor with auto-fetch, or a dynamic cursor when accessing text columns
- Sometimes there is an implicit conversion to keyset (ORDER BY not covered by an index)
- Average fetch size is 3 or 4 rows; this is computed by dividing the ODBC buffer size by the row size
- Siebel uses ODBC Prepare/Execute, for example: SELECT * FROM table WHERE x = ?
What it looks like:
SQL Fetch Mechanism
Siebel OM -> SQL Server: sp_cursorprepexec(..., 3)
SQL Server -> Siebel OM: 3 rows
Siebel OM -> SQL Server: sp_cursorfetch(..., 3)
SQL Server -> Siebel OM: 3 rows
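The "ODBC buffer size divided by row size" rule of thumb behind the 3-to-4-row fetches can be sketched in a few lines (the buffer and row sizes below are illustrative assumptions, not values from a real trace):

```python
# Sketch of how Siebel's average cursor fetch size falls out of the
# ODBC buffer size and the (wide) Siebel row size. The 16 KB buffer
# and the per-row byte counts are assumptions for illustration.

def fetch_size(odbc_buffer_bytes: int, avg_row_bytes: int) -> int:
    """Rows fetched per sp_cursorfetch round trip (at least 1)."""
    return max(1, odbc_buffer_bytes // avg_row_bytes)

# A wide Siebel result row (120+ columns) easily reaches a few KB,
# so a ~16 KB buffer yields only 3 or 4 rows per fetch:
print(fetch_size(16_384, 4_500))   # 3 rows per round trip
print(fetch_size(16_384, 4_096))   # 4 rows per round trip
```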
declare @P1 int
set @P1=-1
declare @P2 int
set @P2=0
declare @P3 int
set @P3=28688 -- Fast Forward, Parameterized, Auto Fetch, Auto Close (undocumented and subject to change)
declare @P4 int
set @P4=8193
declare @P5 int
set @P5=10
exec sp_cursorprepexec @P1 output, @P2 output, N'',
N'SELECT T1.LAST_UPD_BY, T1.ROW_ID, T18.PRTNR_TYPE, T13.CREATED_BY, T2.ASGN_USR_EXCLD_FLG,... T2.PAR_OU_ID
FROM dbo.S_PARTY T1
INNER JOIN dbo.S_ORG_EXT T2 ON T1.ROW_ID = T2.PAR_ROW_ID
INNER JOIN dbo.S_ACCNT_POSTN T3 ON T2.PR_POSTN_ID = T3.POSITION_ID AND T2.ROW_ID = T3.OU_EXT_ID
INNER JOIN dbo.S_PARTY T4 ON T3.POSITION_ID = T4.ROW_ID
LEFT OUTER JOIN dbo.S_ORG_EXT T5 ON T2.PAR_OU_ID = T5.PAR_ROW_ID
LEFT OUTER JOIN dbo.S_PRI_LST T6 ON T2.CURR_PRI_LST_ID = T6.ROW_ID
LEFT OUTER JOIN dbo.S_POSTN T7 ON T2.PR_MGR_POSTN_ID = T7.ROW_ID
LEFT OUTER JOIN dbo.S_USER T8 ON T7.PR_EMP_ID = T8.ROW_ID
LEFT OUTER JOIN dbo.S_ORG_EXT T9 ON T2.PAR_BU_ID = T9.PAR_ROW_ID
LEFT OUTER JOIN dbo.S_ORG_EXT T10 ON T1.PAR_PARTY_ID = T10.PAR_ROW_ID
LEFT OUTER JOIN dbo.S_ORG_PRTNR T11 ON T1.ROW_ID = T11.PAR_ROW_ID
LEFT OUTER JOIN dbo.S_ORG_EXT_SS T12 ON T1.ROW_ID = T12.PAR_ROW_ID
LEFT OUTER JOIN dbo.S_BU T13 ON T1.ROW_ID = T13.PAR_ROW_ID
LEFT OUTER JOIN dbo.S_OU_PRTNR_TIER T14 ON T2.PR_PRTNR_TIER_ID = T14.ROW_ID
LEFT OUTER JOIN dbo.S_ASGN_GRP T15 ON T2.PR_TERR_ID = T15.ROW_ID
LEFT OUTER JOIN dbo.S_INDUST T16 ON T2.PR_INDUST_ID = T16.ROW_ID
LEFT OUTER JOIN dbo.S_ADDR_ORG T17 ON T2.PR_ADDR_ID = T17.ROW_ID
LEFT OUTER JOIN dbo.S_OU_PRTNR_TYPE T18 ON T2.PR_PRTNR_TYPE_ID = T18.ROW_ID
LEFT OUTER JOIN dbo.S_POSTN T19 ON T3.POSITION_ID = T19.PAR_ROW_ID
LEFT OUTER JOIN dbo.S_USER T20 ON T19.PR_EMP_ID = T20.PAR_ROW_ID
LEFT OUTER JOIN dbo.S_ORG_SYN T21 ON T2.PR_SYN_ID = T21.ROW_ID
LEFT OUTER JOIN dbo.S_ORG_BU T22 ON T2.BU_ID = T22.BU_ID AND T2.ROW_ID = T22.ORG_ID
LEFT OUTER JOIN dbo.S_PARTY T23 ON T22.BU_ID = T23.ROW_ID
LEFT OUTER JOIN dbo.S_ORG_EXT T24 ON T22.BU_ID = T24.PAR_ROW_ID
WHERE ((T2.PRTNR_FLG != ''N'') AND ((T13.BU_FLG = ''N'' OR T13.BU_FLG IS NULL) AND T2.PRTNR_FLG = ''Y''))
OPTION(FAST 40)', @P3 output, @P4 output, @P5 output
Typical Siebel OM query: build a plan and return the first 40 rows ASAP (OPTION (FAST 40)).
Why do I get a different query plan in Query Analyzer?
- Bind values (Prepare/Execute model): you hard-coded values instead of binding at run time
- Cursor (SQL Server API cursor): you left off the ODBC wrapper
- SQL hint (FAST 40): you left out the compiler options
- Text column: causes an implicit cursor conversion
- Table spools in one plan but not the other
- Capture the implicit cursor conversion event in Profiler; also capture on the integer data column
(N)TEXT Columns
- (N)TEXT columns may cause performance problems
- In the Siebel database schema: a logical TEXT data type is always translated to a physical (N)TEXT column; a VARCHAR data type can be translated to either an (N)VARCHAR column or an (N)TEXT column; VARCHAR(2000+) is translated to an (N)TEXT column
One Size Fits All: Implicit Cursor Conversions
- One type of cursor is requested, but it cannot be fulfilled in its native call. Rather than fail, SQL Server converts internally. Result: performance problems.
- For example, Siebel uses OPTION (FAST 40). A fast-forward, read-only cursor is requested with an ORDER BY, yet no index covers the WHERE clause in that order. SQL Server converts to a KEYSET cursor, which spools off to TEMPDB for the sort.
- Fix: create an index that matches the ORDER BY. The KEYSET conversion goes away.
- SQL Profiler: Event -> Cursors -> CursorImplicitConversion
Profiler Properties
ODBC implicit cursor conversions: BOL
Query Repro: Quick and Dirty
- Capture on the RPC:Starting event in Profiler
- Cut and paste it into Query Analyzer; 99% of the time it will give you the same plan that is coming out of Siebel
- DON'T spool Siebel output at the client, hard-code the values, and put that into Query Analyzer. It probably won't work; for example, it won't have the OPTION (FAST 40) hint.
How to make the Query in QA

print 'declaring variables'
declare @P1 int
declare @P5 int
declare @P6 int
set @P1=NULL
-- set @P5=28688
-- SCROLLOPT 28676 = 16384 (AutoClose) + 8192 (AutoFetch) + 4096 (Parameterized) + 4 (Forward Only)
set @P5=28676
set @P6=8193
print 'running sp_cursorprepare'
exec sp_cursorprepare @P1 output, N'@P1 varchar(30)',
-- N'SELECT * FROM authors WHERE au_lname like @P1 OPTION (FAST 1)',
N'SELECT * FROM authors WHERE au_lname like @P1', 1, @P5 output, @P6 output
print 'declaring more variables'
declare @P2 int
declare @P3 int
declare @P4 int
set @P3 = 1
set @P2=NULL
set @P3=24592
-- SCROLLOPT 24592 = 16384 (AutoClose) + 8192 (AutoFetch) + 16 (Fast Forward)
set @P4=8193
set @P5=15
print 'executing cursor'
exec sp_cursorexecute @P1, @P2 output, @P3 output, @P4 output, @P5 output, 'R%'
print 'select the results from the cursor execute'
select @P1, @P2, @P3, @P4, @P5, @P6
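The SCROLLOPT values in the script's comments are bitmasks, and a tiny decoder makes the arithmetic easy to check. The flag values follow the script's own comments; the 16 = fast-forward bit is an inference (28688 minus the other named flags), since these sp_cursor* flag values are undocumented:

```python
# Decode the sp_cursor* SCROLLOPT bitmask values seen in the traces above.
# Flag values follow the script's comments; FastForward (16) is inferred
# from 28688 = 16384 + 8192 + 4096 + 16, not official documentation.

SCROLLOPT_FLAGS = {
    16384: "AutoClose",
    8192:  "AutoFetch",
    4096:  "Parameterized",
    16:    "FastForward",   # assumed value
    4:     "ForwardOnly",
}

def decode_scrollopt(value: int) -> list[str]:
    """Return the flag names whose bits are set in a SCROLLOPT value."""
    names = [name for bit, name in SCROLLOPT_FLAGS.items() if value & bit]
    leftover = value & ~sum(b for b in SCROLLOPT_FLAGS if value & b)
    if leftover:
        names.append(f"unknown(0x{leftover:x})")
    return names

print(decode_scrollopt(28676))  # prepare call in the script above
print(decode_scrollopt(28688))  # value seen in the Siebel trace
```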
Siebel and Update Statistics
Problem:
- Siebel queries join a lot of tables
- Many of the joined tables may be smaller than the threshold that triggers an automatic update of statistics
- Result: bad plans for the join
Stale statistics and EIM:
- A plan that tips over: 98 jobs run in 5 minutes, 2 run in 1 hour or a lot longer
- EIM loads rows faster than Auto Update Stats can kick off
Solution:
- Run a manual UPDATE STATISTICS on those tables
- Check rowmodctr in sysindexes for indid = 1
- See the Appendix for code to automate update statistics
Example of Stale Statistics

set statistics io on

Table 'S_CONTACT'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0.
Table 'S_OPPTY'. Scan count 4382, logical reads 21439, physical reads 650, read-ahead reads 877.
Table 'S_PARTY'. Scan count 2, logical reads 17573, physical reads 199, read-ahead reads 0.

After UPDATE STATISTICS with a sample of 10%:

Table 'S_CONTACT'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0.
Table 'S_OPPTY'. Scan count 192, logical reads 1440, physical reads 0, read-ahead reads 11.
Table 'S_PARTY'. Scan count 4, logical reads 1507, physical reads 3, read-ahead reads 59.
Why Wait Types?
- Wait types will help define where your bottleneck is
- They are seen in the master..sysprocesses table, in a column called waittype: select waittype, * from master..sysprocesses
- There are all kinds of wait types: blocking due to database locks, network IO, disk queueing, etc.
- The key to solving throughput is understanding what's damming up the river
Wait Types
If a thread goes into a sleep status, a wait state is set. The wait state is contained in master..sysprocesses in the columns waittype and lastwaittype. Lastwaittype is a character description of the last wait state for this thread; it is not reset until another wait state occurs. Waittype is a varbinary value holding the current wait state. A waittype of 0 means the thread is currently running. See SQL_PERF.DOC (reproduced below) for detailed information.
Customer Review
SQL Server Performance Review

Contents:
- Performance Overview
- Waits & Queues: A Performance Methodology
- Wait Types
- Sysprocesses
- Track_waitstats stored procedure
- Track_waitstats sample output
- Wait types and correlation to other performance info
- Queues (Perfmon counters)
- PERFMON counters, correlation, possible conclusions & actions
- Interesting PERFMON ratios & comparisons
- Application design issues
- Conclusion: Waits & Queues analysis
- Appendix A: References
- Appendix B: CustomerApplicationName benchmark
  - Benchmark environment
  - Tests description
  - Test results
  - Next steps
- Appendix C: IO
  - Quick overview of IO subsystems
  - File & table level IO
SQL Server Performance Review
Performance Overview
The performance of SQL Server 2000 database applications should be evaluated from several different perspectives. Each tells a different portion of the performance story; together they paint a detailed picture of the whole. The first perspective, PERFMON counters, presents performance from a resource point of view; SQL Server object counters are exposed to PERFMON through the system table master..sysperfinfo. Secondly, SQL wait types identify and categorize user (or thread) waits from an application workload perspective. Finally, associations or correlations of wait types to performance counters, as well as interesting performance counter ratios, round out the picture.
Each PERFMON object has counters that are used to measure various aspects of performance, such as transfer rates for disks or the amount of processor time consumed for processors. Perfmon counters (including system, physical disk, etc.) provide a view of performance from a resource standpoint while SQL waits provide a view of performance from a user connection (or application) perspective.
SQL Server 2000 tracks wait information for each user connection. This information is summarized and categorized across all connections so that a performance profile can be obtained for a given work load.
Correlations of wait types to perf counters, and specific ratios of perfmon counters form the basis for an application performance methodology called waits and queues.
Waits & Queues: A Performance Methodology
Application performance can be simply explained by looking at waits and queues. Dbcc sqlperf(waitstats) provides a valuable source of wait information from a thread (or application) point of view. PERFMON on the other hand, provides a breakdown of system resource usage in terms of resource queues.
Requests for system resources such as IO, are made by user connections or threads. If those requests cannot be immediately satisfied, a queue of requests will wait until resources are available.
Wait Types
If a thread goes into a sleep status, a wait state is set. The wait state is contained in master..sysprocesses in the columns waittype and lastwaittype. Lastwaittype is a character description of the last wait state for this thread; it is not reset until another wait state occurs. Waittype is a varbinary value holding the current wait state. A waittype of 0 means the thread is currently running.
Sysprocesses
Each user connection has an associated row in the system table master..sysprocesses. The stored procedure sp_who provides a list of these user connections (threads) along with other connection information such as command, resource, wait type, wait time, and status. When a thread waits, the columns waittype (binary(2)), waittime (int), lastwaittype (nchar(32)), and waitresource are populated. The values for the waittype and lastwaittype columns are set by memory structures in SQL Server.
Lastwaittype is a character description of the last wait type for this thread. It is not reset until another wait state occurs. Thus, a non-blank lastwaittype means the thread had at least one wait state.
The current wait status is recorded in the waittype column. If the waittype is non-zero, the lastwaittype and waittype will be equivalent and indicate the current waitstate for the SPID. If waittype is 0x00, this means the thread is currently running.
Track_waitstats stored procedure
Track_waitstats is a stored procedure that captures waitstats from DBCC SQLPERF and ranks them in descending order by percentage. This is useful in identifying the greatest opportunities for performance improvement. See the sample output below:
create proc track_waitstats (@num_times int=5,@delaymin int=1)
as
-- @num_times is the number of times to capture waitstats, default is 5 times
-- default delay interval is 1 minute
-- create waitstats table if it doesn't exist, otherwise truncate
set nocount on
if not exists (select 1 from sysobjects where name = 'waitstats')
create table waitstats ([Wait Type] varchar(80),
Requests numeric(20,1),
[Wait Time] numeric (20,1),
[Signal Wait Time] numeric(20,1),
now datetime default getdate())
else truncate table waitstats
dbcc sqlperf (waitstats,clear) -- clear out waitstats
declare @i int,@delay varchar(8),@now datetime, @totalwait numeric(20,1),@endtime datetime,@begintime datetime
select @i = 1
select @delay='00:' + right('0'+convert(varchar(2),@delaymin),2) + ':00'
while (@i <= @num_times)
begin
-- loop body restored from the published track_waitstats script:
-- append the current waitstats snapshot, then wait out the delay interval
insert into waitstats ([Wait Type], Requests, [Wait Time], [Signal Wait Time])
exec ('dbcc sqlperf(waitstats)')
waitfor delay @delay
select @i = @i + 1
end
go

If > 20%, check on %Processor Time & %User Time.
%Processor Time
% of time CPU is executing over sample interval.
Common uses of CPU resources:
1. Compilation and re-compilation use CPU resources. Plan re-use and parameterization minimizes CPU consumption due to compilation. For more details on compilation, recompilation, parameterization and plan re-use, see http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql2k/html/sql_queryrecompilation.asp
--Plan re-use is where usecounts are > 1
select dbid, objid, cacheobjtype, objtype, usecounts, sql
from master..syscacheobjects
order by dbid,objid, cacheobjtype,objtype,usecounts,sql
Correlate to PERFMON counters:
1. System: Processor Queue length
2. SQL Statistics: Compilations/sec
3. SQL Statistics: re-Compilations/sec
4. SQL Statistics: Requests/sec
If both of the following are true, you are CPU bound:
1. Proc time > 85% on average
2. Context switches (see system object) > 20K / sec
Lightweight pooling can provide a 15% boost. Lightweight pooling (also known as fiber mode) divides a thread into 10 fibers; overhead per fiber is less than that of an individual thread.
%Idle Time
% of time CPU is idle over sample interval
Interrupts/sec
Interrupts/sec is the average rate, in incidents per second, at which the processor received and serviced hardware interrupts.
Correlate with other perfmon counters such as IO, Network.
Thread
Process
Page Faults
This counter includes both hard faults (those that require disk access) and soft faults (where the faulted page is found elsewhere in physical memory.) Most processors can handle large numbers of soft faults without significant consequence. However, hard faults, which require disk access, can cause significant delays. See the disk component for more information.
Check for memory pressure (see SQL Server buffer manager), low data page hit rates, & memory grants pending.
System
Usage
Processor Queue Length
Number of threads waiting to be scheduled for CPU time. Some common uses of CPU resources that may be avoidable:
1. Unnecessary compilation and recompilation. Parameterization and plan re-use would reduce CPU consumption. See http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql2k/html/sql_queryrecompilation.asp
2. memory pressure
3. lack of proper indexing
Context Switches/sec
SQL Server
Access Method
Forwarded Records/sec
Number of records fetched through forwarded record pointers.
Tables with NO clustered index. If you start out with a short row, and update the row creating a wider row, the row may no longer fit on the data page. A pointer will be put in its place and the row will be forwarded to another page.
Look at code to determine where the short row is inserted followed by an update.
Can be avoided by:
1. Using default values (so that an update will not result in a longer row, the root cause of forwarded records).
2. Using char instead of varchar (fixes the length so that an update will not result in a longer row).
Full Scan/sec
Entire table or index is scanned. Scans can cause excessive IO if an index would be beneficial.
SQL Profiler can be used to identify which TSQL statements do scan. Select the scans event class & events scan:started and scan:completed. Include the object Id data column. Save the profiler trace to a trace table, and then search for the scans event.
The scan:completed event will provide associated IO so you can also search for high reads, writes, and duration.
Index Searches/sec
Number of index searches. Index searches are used to start range scans, single index record fetches, and to reposition within an index.
Compare to Full Scan/sec. You want to see high values for index searches.
Page Splits/sec
Number of page splits occurring as the result of index pages overflowing. Normally associated with leaf pages of clustered indexes and non-clustered indexes.
Page splits are extra IO overhead that results from random inserts.
When there is no room on a data page, and the row must be inserted on the page (due to index order), SQL will split the page moving half the rows to a new page, and then insert the new row.
Correlate to Disk: Avg. Disk sec/Write. If this is very high, you may reorganize the index(es) on the table(s) causing the page splits, to reduce splitting temporarily. A lower fill factor leaves a certain amount of free space on each page for inserts.
Memory Mgr
Memory Grants Pending
Memory resources are required for each user request. If sufficient memory is not available, the user will wait until there is enough memory for the query to run.
Compare with Memory Grants Outstanding. If grants pending increases, you can:
1. Give more memory to SQL Server.
2. Add more physical memory to the box.
3. Check for memory pressure; review and correct indexing if you experience out-of-memory conditions.
Correlate to Waittype
1. RESOURCE_SEMAPHORE
Buffer Manager
Buffer cache hit ratio
Percentage of time the pages requested are already in cache
Check for memory pressure. See Checkpoint pages/sec, Lazywrites/sec and Page life expectancy.
Checkpoint pages/sec
Pages written to disk during the checkpoint process, freeing up SQL cache
Memory pressure is indicated if this counter is high along with high Lazy Writes/sec and low Page Life Expectancy.
Track_waitstats Sample Output
(See the Appendix for the track_waitstats stored procedure.)
Commentary: the above sample shows the majority share of wait time, 48%, being due to network IO waits. Improving network IO is the single largest opportunity for improving application performance. Lesser opportunities in this example include LCK_M_X (exclusive locks) and WRITELOG (transaction log). Exclusive lock waits account for almost 13% of total wait time; an examination of transaction management may offer clues as to whether improvements can be made there. WRITELOG means threads are waiting for physical writes to the transaction log to complete. Given the 11% WRITELOG waits, a further analysis of the PERFMON disk queues for the transaction log will confirm whether the IO capacity of the transaction log drives has trouble keeping up with write requests, as shown by steady, high disk queues.
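The ranking that track_waitstats produces can be sketched in a few lines. The wait figures below are hypothetical, chosen to mirror the shares discussed in the commentary (48% network IO, ~13% exclusive locks, ~11% WRITELOG); real input would come from DBCC SQLPERF(WAITSTATS):

```python
# Sketch of the track_waitstats ranking: given cumulative wait time per
# wait type, rank wait types by share of total wait time, descending.
# The sample numbers are hypothetical, shaped to match the commentary.

def rank_waits(wait_ms: dict[str, float]) -> list[tuple[str, float]]:
    """Return (wait type, percentage of total wait time), largest first."""
    total = sum(wait_ms.values())
    shares = [(wt, 100.0 * ms / total) for wt, ms in wait_ms.items()]
    return sorted(shares, key=lambda pair: pair[1], reverse=True)

sample = {
    "NETWORKIO": 4800.0,
    "LCK_M_X": 1300.0,
    "WRITELOG": 1100.0,
    "PAGEIOLATCH_SH": 900.0,
    "OTHER": 1900.0,
}

for wait_type, pct in rank_waits(sample):
    print(f"{wait_type:<16} {pct:5.1f}%")
```

The top of the ranking is where the biggest tuning opportunity lives, which is exactly how the commentary reads the 48% network IO share.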
Locating the Query Using the Most CPU
- Perfmon counters: Thread: % Processor Time, ID Process (NT PID), ID Thread (KPID)
- Select all SQLSERVR/0 through SQLSERVR/99 instances
- select spid from master..sysprocesses where kpid = <KPID from Perfmon>
- DBCC INPUTBUFFER(<spid>)
Quick Estimate for Long Running Queries

select datediff(mi, last_batch, getdate()) 'minutes',
       spid, waittype, cpu, physical_io,
       convert(char(15), hostname),
       convert(char(15), program_name),
       convert(char(20), getdate()),
       spid, last_batch, cmd
from master..sysprocesses
where spid > 50
  and cmd not like '%WAIT%'
  and datediff(mi, last_batch, getdate()) > 1
order by last_batch
Questions?
EIM & Data LoadingI'm short-changing you here; I have a two-hour presentation just on this subject! The following are some good general ideas on loading data. Attached is a 50-slide presentation on EIM and SQL Server. Please contact me offline for more information on this topic; glad to work with you on it.
Data Loading Best Practices
By Frank [email protected]
Why This Presentation?
Everyone does EIM.All databases have problems doing it.What EIM does.
Symptoms of EIM?
The first several thousand rows will always go in fast. Performance then deteriorates over time. Why? Because the b-trees grow and there are more levels to traverse. The curve grows logarithmically, *NOT* arithmetically. Example: the first 5K rows go in at a rate of 2 minutes; after two weeks of loading, that same 5K rows takes an hour.
[Chart: cumulative load time (Minutes) vs. rows loaded, in 5,000-row batches. A linear baseline (roughly 2 minutes per batch) is plotted against the observed curve, in which each successive batch takes about 5% longer than the one before, so cumulative time diverges steadily: about 2 minutes for the first batch, over 80 minutes cumulative by the 23rd.]
Phases in Data Loading
Get the flat file. Scrub the data. Load the EIM_* tables. Run EIM. Check results. Archive and restart. What everyone wants to do: load 100M rows and leave.
Efficiently Loading Data (I)
Load into pre-staging tables and scrub in tempdb for minimal logging. Set the recovery model for siebeldb during the EIM load: ALTER DATABASE siebeldb SET RECOVERY SIMPLE. Be very aware of the implications; read BOL. Scrub inside SQL Server: you get the efficiencies of SQL Server caching, memory management, CPU usage, and a big database back-end server. Use set-wise processing, not cursors; if you have to use cursors, use FAST_FORWARD/READ_ONLY. Consider use of NOLOCK for better performance: it means dirty reads, which works well if you are the only one on the server.
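A sketch of the recovery-model switch around a load, using the siebeldb name from the slide; the backup path is an assumption, and you should verify the backup implications in BOL first:

```sql
-- Before the load: minimize logging for bulk operations.
ALTER DATABASE siebeldb SET RECOVERY SIMPLE

-- ... run the scrub / BULK INSERT work here ...

-- After the load: restore full logging. Log backups are invalid until a
-- full backup re-establishes the log chain, so take one immediately.
ALTER DATABASE siebeldb SET RECOVERY FULL
BACKUP DATABASE siebeldb TO DISK = 'e:\backup\siebeldb.bak'  -- path is an assumption
```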
Efficiently Loading Data (II)
BULK INSERT vs BCPBULK INSERT runs in-process in the server's memory; BCP is more configurable. Both are single-threaded and run on only one CPU, so run multiple BULK INSERTs at once across multiple CPUs. If the order of the data is not a concern, or you'd rather take the hit when creating the index, it's best to run BULK INSERT into the EIM tables in parallel, deploying a separate thread per CPU. You can use the TABLOCK hint to allow multiple non-blocking inserts.
Efficiently Loading Data (III)
Rules of thumb for the previous slide: use only 2-3 threads at most (and only if you have the processors). Limit the commit size (batch size) to about 2,000 rows per batch, adjusting up or down based on your testing. Remember, if loading in clustered index sequence, use only one thread. Bulk operations are very high performance, but they do log; see the conditions in BOL (e.g., TABLOCK). Try to load the data pre-sorted. Run everything locally on the database server, not distributed over the network.
Efficiently Loading Data (IV)
Disable any triggers on the database (such as workflow triggers) and re-enable them after the process is done. If this is not an initial data load, this means that Workflow Manager or Assignment Manager will not function during the load for the new or updated data.
Bulk Loading
High Performance Data Loading Presentation:
High Performance Data Loading for SQL Server
Gert E.R. Drapers Group Manager SQL Server DTS Development Team Microsoft Corporation
Agenda
Data Loading OptionsData Loading StrategiesOpen Doors & Best PracticesExamples
Data Loading Options
INSERT (singleton vs. batched), INSERT SELECT, INSERT EXEC, SELECT INTO, BULK INSERT, BCP (DB-Library vs. ODBC), OLE-DB IRowsetFastLoad, Bulk XML
The Issues
Column NULLability; variable columns (VARCHAR vs. CHAR); constraints; triggers; indexes, clustered (CI) vs. non-clustered (NCI); implicit data type conversions; locking; logged vs. non- (minimally) logged
Non-Logged Operations
They should be named "MINIMALLY" logged! In a minimally logged situation only page extents are logged, in order to undo page allocations during rollback. Possible minimally logged operations: SELECT INTO, BULK INSERT, CREATE INDEX, WRITETEXT/UPDATETEXT, TRUNCATE TABLE, BCP, IRowsetFastLoad
Recovery Models
Full: everything is fully logged. Bulk_Logged: minimal logging for some operations (CREATE INDEX, bulk load, SELECT INTO, WRITETEXT/UPDATETEXT); not settable per operation, due to the administration impact. Simple: log truncation on checkpoint.
Minimal Logged Bulk Copy
Requirements: the recovery model is Simple or Bulk_Logged; the target table is not being replicated; the target table does not have any triggers; the target table has either 0 rows or no indexes; the TABLOCK hint is specified. If these requirements are not met, all row inserts are fully logged.
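A minimally logged load therefore looks roughly like this; the database, table, and file path are placeholders:

```sql
-- Target must meet all requirements: SIMPLE/BULK_LOGGED recovery, no
-- replication, no triggers, empty table or no indexes, TABLOCK specified.
ALTER DATABASE siebeldb SET RECOVERY BULK_LOGGED

BULK INSERT dbo.EIM_CONTACT            -- placeholder target table
FROM 'e:\load\eim_contact.dat'         -- placeholder data file
WITH (DATAFILETYPE = 'char',
      FIELDTERMINATOR = '|',
      TABLOCK)                         -- required for minimal logging
```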
INSERT
INSERT statements are always logged operations: they apply defaults, fire triggers, and check constraints. Controllable parts: batch size (how many INSERT statements are sent in one batch controls round trips to the server), transactions, SET XACT_ABORT.
SELECT INTO
The destination table cannot already exist, therefore there are no concurrency problems and it can safely take a TABLOCK, and no indexes are present. Inline data conversion is possible by using CONVERT in the column list. Minimally logged if the criteria are met; otherwise logged as singleton INSERT statements.
BCP.EXE
Uses the ODBC BCP API in 7.0 and 2000. Native format is most optimal: less parsing effort/cost and no string-to-data-type conversions (string to datetime, float, real, decimal, etc. is especially expensive). Use load hints: TABLOCK (default = table option "table lock on bulk load") and ORDER (column [ASC | DESC] [,...n]). Increase the network packet size to 64 KB.
BULK INSERT
Fastest input into the server; everything is local to the server. If the table has existing data and is a heap, BULK INSERT starts on a new extent to provide lightweight rollback. Ordered input data example:

BULK INSERT pubs..authors2 FROM 'c:\authors.txt'
WITH ( DATAFILETYPE = 'char', FIELDTERMINATOR = ',', ORDER (au_id ASC) )
SQL 2000 Bulk Insert
New data types and per-column collation; optional firing of triggers during load; constraint checks during loading, plus a new command to check after loading. The biggest change is operational: recovery models make it easy to trade off performance vs. concurrency and easy to maintain recoverability.
BCP vs. BULK INSERT
BCP.EXE: import and export; local or remote; always uses a NetLibrary; version switches for down-level native and char formats (-V60, -V65, -V70).

BULK INSERT: import only; server based; streams the rowset into the server; no support for down-level version input formats (no -V).
Bulk Insert Performance
Scales linearly with number of CPUsImprovement nearly 100% for each CPU
Perfmon counters: SQL Server:Databases object, Bulk Copy Rows/sec and Bulk Copy Throughput/sec
500 MHz Xeon Typical
BULK INSERT Throughput: BULK INSERT of Lineitem by number of parallel streams

Streams  Throughput (MB/sec)  Scaling  CPU %
1        4.5                  ~1 x     99.5
2        8.9                  ~2 x     99.5
3        13.2                 ~3 x     99.5
4        17.5                 ~4 x     99.5
5        21.8                 ~5 x     99.5
6        26.1                 ~6 x     99.5
Parallel Data Loads
SQL Server allows data to be bulk copied into a single table from multiple clients in parallel using BCP or BULK INSERT. Steps: set the database to Bulk-Logged recovery if you usually use Full recovery; specify the TABLOCK hint; ensure the table does not have any indexes. Only applications using the ODBC or SQL OLE DB-based APIs can perform parallel data loads into a single table. The clustered index should be created before the non-clustered indexes; non-clustered indexes can be created in parallel by building each one from a different client/connection concurrently.
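The steps above can be sketched as one script per client connection. The table name and the pre-split data files are assumptions for illustration:

```sql
-- Run one connection per CPU, each loading its own slice of the data.
-- All sessions target the same unindexed heap; TABLOCK lets the bulk
-- loads take compatible locks instead of blocking each other.
ALTER DATABASE siebeldb SET RECOVERY BULK_LOGGED

-- Session 1:
BULK INSERT dbo.EIM_ACCOUNT FROM 'e:\load\accounts_1.dat' WITH (TABLOCK)

-- Session 2 (separate connection, running concurrently):
-- BULK INSERT dbo.EIM_ACCOUNT FROM 'e:\load\accounts_2.dat' WITH (TABLOCK)

-- Afterwards, build the clustered index first, then the non-clustered
-- indexes (those can be built from several connections in parallel).
```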
Locking
If no lock hint is specified for BCP or BULK INSERT, the default uses row-level locks. sp_tableoption can set the bulk load lock mode per table: off = row-level locks used, on = table-level lock used. exec sp_tableoption [sales], 'table lock on bulk load', 'true'. The TABLOCK hint overrides the sp_tableoption setting. It is not necessary to use the TABLOCK hint to bulk load data into a table from multiple clients in parallel, but doing so will improve performance.
Bulk XML
Added in SQL XML Web Release 1Implemented by SQLXMLBulkLoad
A two-stage process: first, it analyzes the mapping schema and prepares the necessary execution plan; second, it applies the execution plan to the XML data encountered in the input stream.
set objBL = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad")
objBL.ConnectionString = "provider=SQLOLEDB.1;data source=MyServer;database=MyDB;uid=MyUID;pwd=MyPWD"
objBL.ErrorLogFile = "c:\error.log"
objBL.Execute "c:\SampleSchema.xml", "c:\SampleData.xml"
set objBL = Nothing
Bulk XML Schema
SampleSchema.xml
Bulk XML Data
SampleData.xml
[Sample XML data: customer records such as 1111 Sean Chai (NY), 1112 Tom Johnston (LA), 1113 Institute of Art; the XML markup was lost in extraction.]
Data Loading Techniques
Initial load into empty tables: load the data with no indexes using BULK INSERT; parallel load from partitioned data files, one load stream per CPU; use the Bulk_Logged or Simple recovery model with the TABLOCK option. Then create the indexes, switch back to the appropriate recovery model, and perform backups.
Data Loading Techniques
For incremental loads: load data with indexes in place; performance and concurrency requirements determine locking granularity. Recovery model changes: Full goes to Bulk_Logged (unless you need to preserve point-in-time recovery); Simple stays unchanged.
Indexing Strategies
If there is no existing data and the incoming data stream is sorted on the CI key: create the CI before the load, with no NCIs; create the NCIs after the CI is created; use CREATE INDEX from multiple sessions to create indexes in parallel. If data exists in the table: drop the CI and NCI indexes and load into a heap; create the CI (free space of 1.5 times the size of the CI is required); create the NCIs after the CI is created.
When to drop and recreate?
CREATE INDEX
Fillfactor. SORT_IN_TEMPDB: specifies that the intermediate sort results used to build the index are stored in tempdb. This option may reduce the time needed to create an index if tempdb is on a different set of disks than the user database, but it increases the amount of disk space used during the index build.
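For example, a CREATE INDEX using both options together with the 50% fill factor suggested later for initial loads; the index name and key columns are placeholders:

```sql
-- Build the clustered index with free space for a load-heavy table,
-- sorting intermediate runs in tempdb to spread the IO across spindles.
CREATE CLUSTERED INDEX EIM_ACCOUNT_C1      -- placeholder index name
ON dbo.EIM_ACCOUNT (IF_ROW_BATCH_NUM, ROW_ID)
WITH PAD_INDEX, FILLFACTOR = 50, SORT_IN_TEMPDB
```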
Parallel Index Creation
Fast, with a controllable degree of parallelism. How it works: a separate thread builds a sub-index for each range; each thread is fed by a parallel scan; the sub-indexes are stitched together at the end; optional use of tempdb.
[Diagram: parallel scans feed one sub-index per range; the sub-indexes are stitched together into the complete index.]
Index Maintenance
Index creation is parallel and concurrent: parallel for the largest indexes (the default on SMP), concurrent for smaller indexes, or a combination for mixed indexes. Consider the Bulk_Logged recovery model. Maintenance: fast analysis; online reorganization at an appropriate threshold (allow for larger log backups; slower than rebuilding the index).
TEMPDB specialties
The recovery model is always SIMPLE and cannot be changed (implies trunc. log on chkpt.). Collation and sort order are inherited from the server (based on MODEL); for a temp table with a different collation/sort order, specify it during creation to prevent conversion and/or data loss. Auto create and auto update statistics are on by default.
Determine your bottleneck
Who is using the resources, the loader program or SQL Server? Check CPU, disk I/O, memory, and network; use Perfmon (enable DISKPERF). Common loader problems: string parsing and data type conversions.
Data Loading Open Doors
Pre-size the database to prevent (massive) file growth during the load. Load on the server and cut out the network; if you have to run over the net, raise the max packet size to 64 KB and get a fast line between A and B (dedicated segments/lines, Giganet). If the data is sorted on the CI key, load into an empty table with the clustered index in place and a table lock. If the data is not sorted, load into an empty table with no indexes, then build the CI first and the NCIs second.
Data Loading Open Doors
Use BULK INSERT in favor of BCP.EXE; if you use BCP.EXE, use the 7.0 or 2000 version, not 6.5. Use the native Unicode data format for moving data between SQL Servers. Order the data on export using the BCP queryout option. Use tempdb for index creation, especially when tempdb is located on a different drive/spindle, and pre-size tempdb. Don't put your load file on the same drive/spindle as the database or t-log files; the load file is 100% sequential data access.
Last open door
Don't forget BACKUP and sp_detach_db. To move information between two SQL Server databases, sometimes it will be faster to detach, copy the files, attach, and SELECT INTO; or to backup, copy the files, restore, and SELECT INTO.
Examples
Data Loading ExamplesDTS Performance Characteristics
DTS Performance Tips
Bulk Insert vs. DTS; Transform Data Task vs. DDQ; tweaks for fastload; SQL vs. code; custom transforms vs. VBScript; different scripting languages; connection model; avoid serialisation/main thread execution; MDAC multi-processor scalability
Performance-Transform Data Task Options
Transform Data Task options, time to load 1 million rows of mixed data:

Default                  335
DB Options               229
Batch 100,000            219
Batch 10,000             206
Batch 10k + Table Lock   182
Parallelism              169
Performance-Transform Data Task vs. DDQ
Transform Data Task vs. DDQ, time to load 1 million rows of mixed data:

DDQ                      3000
Batch 10k + Table Lock   182
Performance-Transformation Options
Transformation options, time to load 1 million rows of mixed data:

Ax Upper Case            377
Custom Transform         216
T-SQL                    204
Conclusion
There are many different ways to load data: BULK INSERT if you load from a file; BCP if you load from a file over the network; DTS if you need to perform transformations and/or load from other data sources. Know your scales and volumes (current and projected), know the rules of the game, and test, test, test before going into production!
Questions?
Or e-mail [email protected]
Stale Statistics
What happens when stats get old? Bad plans: a query that normally runs in a few seconds can take 30 minutes. How do the stats get stale? EIM updates every row in the EIM_* table, and the process that auto-updates stats doesn't wake up in time between runs; small tables will never be updated. Correct this by running UPDATE STATISTICS between runs, or with a SQL Agent job that wakes up and runs it. Consider turning off auto update statistics for the data load. It's all about getting predictable performance: out of 100 runs, 97 run in 5 minutes and 3 run in 4 hours each, at midnight when no one is around to kill them. It's hard to pick a bad plan with only a clustered index.
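Between EIM batches the fix is a one-liner per table; the table names here are examples:

```sql
-- Refresh distribution statistics on the interface and base tables
-- touched by the last batch, so the next batch compiles good plans.
UPDATE STATISTICS dbo.EIM_CONTACT WITH FULLSCAN
UPDATE STATISTICS dbo.S_CONTACT

-- Optionally freeze autostats for the duration of the load:
EXEC sp_dboption 'siebeldb', 'auto update statistics', 'false'
```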
Fragmentation Happens
Fragmentation happens during big data loads. Run DBCC DBREINDEX to correct it, and think about fill factor and pad index. Check with Perfmon: SQL Server: Access Methods -> Page Splits/sec, and trend it over time. Look at the sample script to defrag; defrag and update stats between runs. Compare DBCC INDEXDEFRAG, DBCC DBREINDEX, and drop/recreate: samples, run times, pros and cons. Possibly use filegroups for the initial load; for example, put the EIM tables in their own filegroup or put the indexes in their own filegroup.
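A sketch of the between-runs maintenance step; the table, index, and 50% fill value follow the slide's suggestions and the object names are examples:

```sql
-- Rebuild all indexes on the EIM table at 50% fill to absorb page splits.
-- DBCC DBREINDEX rebuilds offline; DBCC INDEXDEFRAG compacts online.
DBCC DBREINDEX ('dbo.EIM_CONTACT', '', 50)

-- Lighter-weight online alternative for a base table index:
-- DBCC INDEXDEFRAG (siebeldb, 'dbo.S_CONTACT', S_CONTACT_P1)

-- Refresh stats once the physical layout has changed.
UPDATE STATISTICS dbo.EIM_CONTACT
```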
Cursor vs. Set Wise
The following is based on a 9.5-million-row table on a server with 2 CPUs and 2 GB of RAM.
DECLARE scrub_cursor CURSOR FOR
    SELECT y FROM table_x ORDER BY y

OPEN scrub_cursor

FETCH NEXT FROM scrub_cursor INTO @y
WHILE @@FETCH_STATUS = 0
BEGIN
    -- ... update table_x for this row, then FETCH NEXT ...
END
Cursor
DECLARE scrub_cursor CURSOR FORSELECT yFROM table_x order by y
-- normal cursor: 4:23 minutes (2003-08-06 09:48:26.560 to 09:52:49.230)
-- fast forward cursor: 3:50 minutes (2003-08-06 09:56:16.140 to 10:00:05.780)
Set Wise
It is nearly as much work for SQL Server to set up a statement that crunches 1 row as one that crunches 10K rows, so do the work in one set-based statement.
select getdate()go
update table_xset ROW_ID = '0'go
select getdate()
go
-- 26 seconds: 2003-08-06 09:44:14.263 to 09:44:40.030
-- (943093 row(s) affected)
Hints & EIM
/*
UPDATE BT SET
    BT.END_DT = IT.AS_END_DT,
    BT.NAME = IT.AS_NAME,
    BT.START_DT = IT.AS_START_DT,
    BT.X_CHANGE_CODE = IT.X_CHANGE_CODE,
    BT.X_CHANGE_DATE = IT.X_CHANGE_DATE,
    BT.X_CHANGE_TYPE = IT.X_CHANGE_TYPE,
    BT.X_POLICY_TYPE = IT.X_POLICY_TYPE,
    BT.X_PREMIUM = IT.X_PREMIUM,
    BT.X_PRINTED_FLG = IT.X_PRINTED_FLG,
    BT.X_PRODUCT_DESC = IT.X_PRODUCT_DESC,
    BT.X_PRODUCT_TYPE = IT.X_PRODUCT_TYPE,
    BT.X_RATE_PLAN_CD = IT.X_RATE_PLAN_CD,
    BT.X_SOURCE_SYSTEM = IT.X_SOURCE_SYSTEM,
    BT.LAST_UPD = @P1,
    BT.LAST_UPD_BY = @P2,
    BT.MODIFICATION_NUM = BT.MODIFICATION_NUM + 1
FROM dbo.S_ASSET BT (INDEX = S_ASSET_P1),
     dbo.S_ASSET_IF IT (INDEX = S_ASSET_IF_M1)
WHERE (BT.ROW_ID = IT.T_ASSET__RID AND
       IT.IF_ROW_BATCH_NUM = 10410001 AND
       IT.IF_ROW_STAT_NUM = 0 AND
       IT.T_ASSET__EXS = 'Y' AND
       IT.T_ASSET__UNQ = 'Y' AND
       IT.T_ASSET__DUP = 'N' AND
       IT.T_ASSET__STA = 0)
*/
/*
WITH HINTS:
Table 'S_ASSET'. Scan count 1273, logical reads 4038, physical reads 0, read-ahead reads 0.
Table 'S_ASSET_IF'. Scan count 1, logical reads 5875, physical reads 0, read-ahead reads 0.
WITHOUT HINTS:
Table 'S_ASSET'. Scan count 1273, logical reads 4038, physical reads 0, read-ahead reads 0.
Table 'S_ASSET_IF'. Scan count 1, logical reads 1774, physical reads 0, read-ahead reads 0.
*/
Hints (II)
WITH HINT:
Table 'S_CONTACT'. Scan count 1142, logical reads 8008, physical reads 0, read-ahead reads 0.
Table 'S_CONTACT_IF'. Scan count 1, logical reads 3162, physical reads 0, read-ahead reads 0.
WITHOUT HINT:
Table 'S_CONTACT'. Scan count 1142, logical reads 8008, physical reads 0, read-ahead reads 0.
Table 'S_CONTACT_IF'. Scan count 1, logical reads 231, physical reads 0, read-ahead reads 0.
WITH HINT:
Table 'S_APPLD_CVRG'. Scan count 1, logical reads 394774, physical reads 0, read-ahead reads 280810.
Table 'S_ASSET5_FN_IF'. Scan count 1, logical reads 366, physical reads 0, read-ahead reads 0.
WITHOUT HINT:
Table 'S_APPLD_CVRG'. Scan count 1268, logical reads 10203, physical reads 697, read-ahead reads 0.
Table 'S_ASSET5_FN_IF'. Scan count 1, logical reads 366, physical reads 0, read-ahead reads 0.
IFB File (I)
IFB File (II)
[Siebel Interface Manager]
USER NAME = "SADMIN"
PASSWORD = "SADMIN"
PROCESS = Import ISS

; FIXES
USE INDEX HINTS = FALSE
USE ESSENTIAL HINTS = FALSE
;
; This group of processes provides samples for importing data through
; all the interface tables, broken up into logical groups. Note
; that the order of import is often significant.

[IMPORT ISS]
TYPE = IMPORT
TABLE = EIM_ISS
BATCH = 1
ONLY BASE TABLES = S_ISS
ONLY BASE COLUMNS = S_ISS.NAME, S_ISS.LANG_ID, S_ISS.BU_ID
IFB File (III)
2003-10-20 17:52:45
UPDATE dbo.EIM_ISS SET
    T_ISS__RID = '1-239-' + CONVERT(varchar, MS_IDENT - 1),
    T_ISS__UNQ = 'Y'
FROM dbo.EIM_ISS T1
WHERE (T_ISS__EXS = 'N' AND
       ROW_ID = (SELECT MIN(ROW_ID)
                 FROM dbo.EIM_ISS T2 (INDEX (EIM_ISS_T05))
                 WHERE ((T2.ISS_LANG_ID = T1.ISS_LANG_ID OR
                         (ISS_LANG_ID IS NULL AND T1.ISS_LANG_ID IS NULL)) AND
                        T2.T_ISS_BU_ID = T1.T_ISS_BU_ID AND
                        T2.ISS_NAME = T1.ISS_NAME AND
                        T2.IF_ROW_BATCH_NUM = 1 AND
                        T2.IF_ROW_STAT_NUM = 0)) AND
       IF_ROW_BATCH_NUM = 1 AND
       IF_ROW_STAT_NUM = 0 AND
       T_ISS__STA = 0)

GenericLog GenericError 1 2003-10-20 17:52:45 [Microsoft][ODBC SQL Server Driver][SQL Server]Index 'EIM_ISS_T05' on table 'dbo.EIM_ISS' (specified in the FROM clause) does not exist.
GenericLog GenericError 1 2003-10-20 17:52:45 (compmain.cpp 16(1555) err=100101 sys=0) SBL-EIM-00101: Invalid arguments to function.
GenericLog GenericError 1 2003-10-20 17:52:45 (smisched.cpp 17(821) err=100101 sys=0) SBL-EIM-00101: Invalid arguments to function.
Recovery Models in EIM
Run a full backup frequently during the day, and weigh the issues of recovery vs. lost data. Note: switching from FULL to SIMPLE breaks the log chain and has recovery consequences. Always make a full backup after switching recovery models.
One Strategy
Drop all indexes, on both the EIM_* and base tables, and let the Index Tuning Wizard tell you what you need. Use SQL Server Profiler to find scans (e.g., > 10,000 reads) and add those indexes back. Use sp_indexoption to enable row-level-only locking (disable page-level locking); this facilitates scaling through multiple parallel jobs. Explain how IF_BATCH_NUM impacts parallel jobs. Know your data!
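The locking change can be sketched as follows; the table name is an example, and passing the table name as the pattern applies the option to all of its indexes:

```sql
-- Disallow page locks so parallel EIM jobs take row locks only;
-- this reduces blocking between batches sharing the same table.
EXEC sp_indexoption 'dbo.EIM_CONTACT', 'DisAllowPageLocks', TRUE
```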
Reality (I)
Out of Box
sp_helpindex EIM_CONTACT3
EIM_CONTACT3_T01, EIM_CONTACT3_T02, EIM_CONTACT3_T03, EIM_CONTACT3_T04, EIM_CONTACT3_T05, EIM_CONTACT3_T06, EIM_CONTACT3_T07, EIM_CONTACT3_T08, EIM_CONTACT3_T09, EIM_CONTACT3_T10, EIM_CONTACT3_T11, EIM_CONTACT3_T12, EIM_CONTACT3_T13, EIM_CONTACT3_T14, EIM_CONTACT3_T15, EIM_CONTACT3_T16, EIM_CONTACT3_T17, EIM_CONTACT3_T18, EIM_CONTACT3_T19, EIM_CONTACT3_T20, EIM_CONTACT3_U1
A Large Customer
sp_helpindex EIM_CONTACT3
EIM_CONTACT3_T01 EIM_CONTACT3_U1
Reality (II)
S_CONTACT_EI, S_CONTACT_F10, S_CONTACT_F11, S_CONTACT_F12, S_CONTACT_F13, S_CONTACT_F15, S_CONTACT_F2, S_CONTACT_F3, S_CONTACT_F4, S_CONTACT_F5, S_CONTACT_F6, S_CONTACT_F7, S_CONTACT_F8, S_CONTACT_II, S_CONTACT_M1, S_CONTACT_M11, S_CONTACT_M12, S_CONTACT_M13, S_CONTACT_M14, S_CONTACT_M15, S_CONTACT_M16, S_CONTACT_M17, S_CONTACT_M18, S_CONTACT_M19, S_CONTACT_M2, S_CONTACT_M20, S_CONTACT_M21, S_CONTACT_M22, S_CONTACT_M3, S_CONTACT_M4, S_CONTACT_M6, S_CONTACT_M8, S_CONTACT_M9, S_CONTACT_P1, S_CONTACT_U1, S_CONTACT_U2, S_CONTACT_V1, S_CONTACT_V2, S_CONTACT_V3, S_CONTACT_V5
S_CONTACT_EI S_CONTACT_F6_X S_CONTACT_II S_CONTACT_M1 S_CONTACT_M50 S_CONTACT_M8 S_CONTACT_ML1_X S_CONTACT_ML2_X S_CONTACT_ML3_X S_CONTACT_ML4_X S_CONTACT_ML5_X S_CONTACT_ML6_X S_CONTACT_P1 S_CONTACT_PREM01_X S_CONTACT_PREM02_X S_CONTACT_U1 S_CONTACT_U2 S_CONTACT_V3
Indexes in RED were custom.
Summary (I)
Optimize your EIM: batch size; hint removal (Siebel and SQL Server); turn off docking replication; get rid of workflow triggers. Only load the tables needed from the Siebel meta data; loading the whole catalog can represent 25% of the time of your whole batch run. Run batches in parallel. Exclude validation of unused data: if you know something is never NULL (e.g., EmpID), then don't check for it.
Summary (II)
Update stats: stale stats can cause problems; investigate turning autostats off. Defrag between large runs, both EIM_ and base tables. On large initial loads, set fill factor and pad index to 50% to cut down on page splits (the default is 100% full). Use minimally logged operations to load and scrub: BULK INSERT, SELECT INTO, recovery models. Run all data loading locally. Scrub data inside SQL Server, with no cursors. Make the right indexes: try a monster clustered index only; get rid of unused indexes and add them back after the runs. Work with ES to resolve support issues.
Appendix (I)
The unabridged talk: 52 slides
Siebel Data Loading Best Practices on SQL Server
Frank Earl [email protected]
How many of you have used SQL Server and EIM?
Objectives
What is EIM?What makes SQL Server different than other platforms?What can we optimize?How do we optimize?What tools can I use?What techniques can I try?
What is EIM?
EIM is the process that Siebel uses to load data into the database, and it is used on every platform. It validates the data (PK/FK relationships) and generates unique Siebel ROW_IDs. Every customer uses it, and it tends to be the first problem every customer hits with Siebel. You cannot bypass EIM: no BCP or BULK INSERT directly into Siebel base tables.
Symptoms of EIM?
The first several thousand rows will always go in fast. Performance then deteriorates over time. Why? Because the b-trees grow and there are more levels to traverse. The curve grows logarithmically, *NOT* arithmetically. Example: the first 5K rows go in at a rate of 2 minutes; after two weeks of loading, that same 5K rows takes an hour.
[Chart: cumulative load time (Minutes) vs. rows loaded, in 5,000-row batches. A linear baseline (roughly 2 minutes per batch) is plotted against the observed curve, in which each successive batch takes about 5% longer than the one before, so cumulative time diverges steadily: about 2 minutes for the first batch, over 80 minutes cumulative by the 23rd.]
What makes EIM query different? (vs. a normal Siebel query)
EIM is batch oriented, whereas normal usage is OLTP. EIM has better logging integrated into it. It is typically more complicated: EIM inserts, updates, selects, and deletes, and it reads the system catalog on each run. You can alter the indexes on the EIM tables, but without ES approval you cannot alter Siebel base tables. EIM is typically less configurable than a business rule: you can tune the IFB file, but you can't control the query that well.
Common Siebel Performance Problems
Docking replication turned on; too many indexes; wrong indexes; clustered index issues; table spools (non-EIM); implicit cursor conversion (non-EIM); too much meta data being loaded; batch size not optimal; blocking/deadlocks while loading in parallel; stale stats (EIM only); maybe hints in the Siebel EIM job
The pre-EIM Loading Issue
Siebel provides EIM but no mechanism to populate the EIM_ tables, so every customer has to reinvent the wheel: a method to populate the EIM tables, data scrubbing, log checking, archival of data, parallelism/job scheduling. Some customers do it well, others not. Don't distribute over the network; don't validate/scrub while loading; don't embed VB script to validate/scrub data while it is being loaded. Use database stored procedures and gain the efficiencies of caching. What the Alliance is working on: a better wheel.
What are the typical problems with EIM? (I)
Too many indexes, both on EIM_* tables and base tables; for example, indexes that are 100% NULL. Incorrect indexes with poor selectivity.
What are the typical problems with EIM? (II)
You are allowed to modify/add/drop indexes on the EIM_* tables. IMPORTANT: you are not allowed to change base table indexes without ES approval; build a case and present it to them. Some customers use expert mode in Siebel Tools and alter the meta data. Poor statistics lead to bad plans: update statistics. Parallelism: use sp_configure to set MAXDOP to 1.
How To Look At The Problem
Look at the problem from the bottom up, using all resources. From Siebel: use the EIM logs, which show how long each SQL statement took to process and factor in network performance. From SQL Server: use Profiler, which shows you IO, execution plan, duration, etc. From NT: Perfmon shows you many SQL Server counters, cache statistics, CPU utilization, etc.
Tools to Use (I)
SQL Server Profiler gives you a SQL Server kernel view of the query. A level-8 EIM trace will show you network times, etc., but will not always show you hints and will not show you plans.
Tools to Use (II)
Index Tuning Wizard: it will almost never suggest a better index. Why? Because Siebel already indexes just about every possible configuration! What it will show you is which indexes ARE being used; hence you can deduce which indexes are NOT being used. DBCC SHOW_STATISTICS will show you which indexes are 100% one value, which are probably not being used.
Tools to Use (III)
ISQLW: use the ODBC wrapper; you need the proper wrapper or you will get an incorrect plan. SET STATISTICS PROFILE shows plans with actual (not estimated) costing. SET STATISTICS IO shows the IO only. Only use one of these at a time, e.g., only STATISTICS IO or STATISTICS PROFILE. Also: DBCC DBREINDEX, UPDATE STATISTICS, and graphical showplan (estimated).
SQL Profiler
A powerful tool, but it can burn up a lot of CPU if you are not careful; it is not a black-box flight recorder, so use it sparingly. What happens if you capture everything? Eight CPUs at 100%, dropped events, etc. Filter on the SPID or CLIENT PROCESS. Capture these events: Stored Procedures -> RPC:Completed and TSQL -> SQL:BatchCompleted, plus the Binary Data column. Save off to a file, then to a table, and analyze. A good usage: filter on all queries taking > 30 seconds or with high IO (reads/writes), measured over 24 hours; those tend to be your problems.
NT perfmon.exe
Shows you performance counters for SQL Server as related to the base operating system: for example, cache hit rates, page splits on disk, transaction log throughput, and disk queuing.
How To Optimize EIM (I):
Turn on Level 8 EIM logging.Very verboseTurn on SQL Server Profiler for the run.Run an EIM batch.Get a base line to measure progress.
How to Optimize EIM (II):
Look at the EIM log: what has the longest execution time? (Tip: load the log into Excel and sort, then search the original log.) Look at the SQL Profiler trace: sort on longest duration and then on reads; these queries should match those in the EIM log. Look at the execution plans: why is a query taking a long time? A table spool? Bookmark lookups? Would a covered index fix it? Use Excel to help sort. Once the problem has been isolated, figure out the fix.
Hints (I) SQL Server:
Siebel uses hints by default. This helps rule-based optimizers but hurts cost-based ones; big performance gains can be made just by taking the hints out. Why you can't just drop the hinted index: the query will not compile. If you are still having problems with hints, contact PSS and they can help you with advanced diagnostics.
Hints (II) Siebel:
From the IFB file configuration, test with and without index hints: USE INDEX HINTS = FALSE, USE ESSENTIAL HINTS = FALSE
More IFB File Configuration
Consider using the parameters as appropriate:ONLY BASE TABLES, IGNORE BASE TABLES, ONLY BASE COLUMNS, IGNORE BASE COLUMNS
This will reduce the amount of processing performed by EIM. ONLY and IGNORE are mutually exclusive.
Index Tuning Wizard
It tends not to help much because Siebel already has just about every index in place, and it will not suggest a clustered index. It takes a long time to run, but it can help you determine which indexes to drop: by showing you which indexes are used, you can infer which ones are not.
Which Indexes Are Used?
Look at the Profiler trace from the EIM run; within the trace, look at the execution plan, which tells you which indexes were used. Remember, you have to capture with Event: Performance -> Show Plan Statistics and the BinaryData data column. Output example: Clustered Index Scan(OBJECT:([master].[dbo].[sysobjects].[sysobjects]))
Index Removal (I)
Why remove indexes? They impose penalties on the INSERTs/DELETEs that make up the bulk of EIM, and many indexes are not used. Only keep the EIM indexes that are used: fewer indexes mean fewer choices and fewer chances of a wrong plan being generated. Less information to the optimizer can sometimes be good.
Index Removal (II)
Feel free to remove indexes on the EIM_* tables. Work with Siebel ES to remove unneeded indexes on the base tables, and confirm their position on dropping base-table indexes during data loads, provided they are added back after the load. Build a case: for example, show that a column is 100% NULL. The optimizer will probably never use a single-value index; only a COUNT on that column would use it.
Index Removal (III)
DBCC SHOW_STATISTICS
Look at Density and Steps.
Does the Table have a Clustered Index?
Some tables don't. Run the script in the notes to find out which ones don't, compare the results to your mappings and ER diagram, and ask Siebel why they don't have a clustered index.
Clearly Think Out Your Clustered Index
Smart design can help in covering the index. Non-clustered indexes are built on top of clustered indexes: they point to the cluster key. Every Siebel query has ROW_ID in it; that's why it is the clustered index on all base tables. Look at DBCC SHOW_STATISTICS.
Index Strategies (I)
A successful strategy has been:
Analyze which columns are being used and put them in the clustered index, using a BOE (back-of-the-envelope) method for selectivity.
Drop all non-clustered indexes from the EIM table.
Keep just the unique indexes on the Siebel base tables.
The premise is that EIM operates on every row in the batch anyway, so why not just scan them all? Is it worth the extra overhead of more indexes for what amounts to a scan?
Index Strategies (II)
This strategy works because BATCH_NUM firewalls the scan from becoming a full table scan: BATCH_NUM is the first column in the clustered index, which bounds the scan to one batch's key range.
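A minimal sketch of this strategy, assuming a hypothetical EIM table name and the IF_ROW_BATCH_NUM column pattern seen elsewhere in this deck:

```sql
-- Batch-number-leading clustered index: each EIM batch scans only its own key range
CREATE CLUSTERED INDEX EIM_ACCOUNT_CL
ON dbo.EIM_ACCOUNT (IF_ROW_BATCH_NUM, ROW_ID)
```

With this in place, parallel batches touch disjoint ranges of the clustered index, which also reduces deadlocking between concurrent EIM processes.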
Performance Testing
After the query is isolated, make a baseline:
Use ISQLW.
Use SET STATISTICS IO ON and SET STATISTICS PROFILE ON (one at a time; turn each off between runs).
Run DBCC DROPCLEANBUFFERS and DBCC FREEPROCCACHE between tests.
Change/add an index, then rerun the query with IO and PROFILE.
Remember McBath's oil argument.
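The baseline steps above can be scripted roughly like this (the query in the middle is a placeholder for the one you isolated):

```sql
DBCC DROPCLEANBUFFERS   -- flush clean pages so reads come from disk, not cache
DBCC FREEPROCCACHE      -- clear the plan cache so the query compiles fresh
GO
SET STATISTICS IO ON
GO
-- <run the isolated query here and record logical/physical reads>
GO
SET STATISTICS IO OFF
```

Rerun the identical script after each index change so the before/after read counts are comparable.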
McBaths Oil Argument
Do you notice when you change the oil in your car? No, but your engine does. When something is massively iterated, saving even 1 I/O helps throughput and scalability, because the server accesses data more efficiently on disk, in cache, and on CPU. Reindex and adjust fill factor, try resequencing columns in an index, and check I/O with SET STATISTICS IO.
Fragmentation Happens
Fragmentation happens during big data loads. Run DBCC DBREINDEX to correct it, and think about Fill Factor and Pad Index. Check with Perfmon (SQL Server: Access Methods -> Page Splits/sec) and trend over time. Look at the sample script to defragment. Defragment and update statistics between runs: DBCC INDEXDEFRAG, DBCC DBREINDEX, and drop/recreate each have different run times, pros, and cons. Possibly use filegroups for the initial load; for example, put the EIM tables, or their indexes, in their own filegroup.
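As a sketch of the two in-place options, with a placeholder table name (DBCC DBREINDEX takes table, index name — '' for all indexes — and fill factor):

```sql
-- Offline rebuild of all indexes on an EIM table, at 70% fill factor
-- to leave room for page splits during the next run
DBCC DBREINDEX ('dbo.EIM_ACCOUNT', '', 70)

-- Lighter-weight alternative: defragment one index in place
-- (database, table, index)
DBCC INDEXDEFRAG ('siebeldb', 'dbo.EIM_ACCOUNT', 1)
```

DBREINDEX is thorough but takes the table offline; INDEXDEFRAG is slower per page but leaves the table available.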
Stale Statistics (I)
What happens when statistics get old? Bad plans: a query that normally runs in a few seconds can take 30 minutes. How do the stats get stale? EIM updates every row in the EIM_* table, and the process that auto-updates statistics doesn't wake up in time between runs. Statistics on small tables will never be updated.
Stale Statistics (II)
Correct this by running UPDATE STATISTICS between runs, or by a SQL Agent job that wakes up and runs it. Consider turning off auto update statistics for the data load. It's all about getting predictable performance: out of 100 runs, 97 run in 5 minutes and 3 take 4 hours each, at midnight when no one is around to kill them.
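A minimal sketch of the between-runs maintenance, with placeholder table and database names:

```sql
-- Refresh distribution statistics after EIM has touched every row in the batch
UPDATE STATISTICS dbo.EIM_ACCOUNT WITH FULLSCAN

-- Optionally freeze auto-update statistics for the duration of the load window
EXEC sp_dboption 'siebeldb', 'auto update statistics', 'false'
```

Re-enable the option after the load so normal daytime workloads keep their statistics current.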
Multiple EIM Batches
The number of batches is directly related to your clustered index design; a good index will keep deadlocks from happening. You can run multiple batches against the same _IF table! EIM tables can also be partitioned across multiple SQL Servers. Use sp_indexoption to enable row-level locks only: this cuts down on blocking issues, but will use more memory for the locks.
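A sketch of the lock-granularity change, assuming the S_ASSET_P1 clustered index named in this deck's appendix:

```sql
-- Disallow page locks on the clustered index so concurrent EIM batches
-- take row locks only (less blocking, more lock memory)
EXEC sp_indexoption 'dbo.S_ASSET.S_ASSET_P1', 'AllowPageLocks', FALSE
```

Watch lock memory in Perfmon after making this change, since thousands of row locks replace each page lock.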
Efficiently Loading Data (I)
Load into pre-staging tables and scrub in tempdb for minimal logging. Set the recovery model for siebeldb during the load:
ALTER DATABASE siebeldb SET RECOVERY SIMPLE
Be very aware of the implications; read BOL. Scrub in SQL Server to get the efficiencies of SQL Server caching, memory management, CPU usage, and a big database back-end server. Use set-wise processing, not row-by-row (PL/SQL-style) cursors; if you have to use cursors, use FAST_FORWARD/READ_ONLY. Consider NOLOCK for better performance: dirty reads, but it works well if you are the only one on the server. Example:
ISAM Cursor Performance
The following is based on a 9.5 million row table. 2 CPU. 2G RAM.
DECLARE scrub_cursor CURSOR FOR
SELECT y
FROM table_x
ORDER BY y
OPEN scrub_cursor
FETCH NEXT FROM scrub_cursor INTO @y
WHILE @@FETCH_STATUS = 0
BEGIN
    -- ... update table_x ...
    FETCH NEXT FROM scrub_cursor INTO @y
END
Performance Results With Cursors
DECLARE scrub_cursor CURSOR FOR
SELECT y
FROM table_x
ORDER BY y
-- normal cursor: 4:23 minutes
-- 2003-08-06 09:48:26.560  ->  2003-08-06 09:52:49.230
-- fast forward cursor: 3:50 minutes
-- 2003-08-06 09:56:16.140  ->  2003-08-06 10:00:05.780
Or Set Wise Processing
It is just as much work for SQL Server to crunch 1 row as 10K rows.
select getdate()
go
update table_x
set ROW_ID = '0'
go
select getdate()
go
-- 26 seconds
-- 2003-08-06 09:44:14.263
-- (943093 row(s) affected)
-- 2003-08-06 09:44:40.030
Efficiently Loading Data (II)
BULK INSERT vs. BCP: BULK INSERT runs in-process with SQL Server, while BCP is more configurable. Both are single threaded and run on only one CPU, so run multiple BULK INSERTs at once across multiple CPUs. If the order of the data is not a concern, or you'd rather take the hit when creating the index, it's best to run BULK INSERT into the EIM tables in parallel by deploying a separate thread per CPU. You can use the TABLOCK hint to allow multiple non-blocking inserts.
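A sketch of one such parallel load thread (the file path, terminators, and table name are placeholders):

```sql
-- One BULK INSERT thread; run siblings against other data files on other CPUs
BULK INSERT dbo.EIM_ACCOUNT
FROM 'D:\load\accounts_part1.dat'
WITH (TABLOCK,              -- allows multiple concurrent non-blocking bulk loads
      BATCHSIZE = 2000,     -- commit size per the rule of thumb on the next slide
      FIELDTERMINATOR = '|',
      ROWTERMINATOR = '\n')
```

Split the source file into one part per thread and start each BULK INSERT from a separate connection.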
Efficiently Loading Data (III)
Rules of thumb for the previous slide:
Use only 2-3 threads at most, and only if you have the processors.
Limit the commit size (batch size) to about 2,000 rows per batch; adjust up or down based on your testing.
If loading in clustered index sequence, use only one thread.
Bulk operations are very high performance, but they do log; see the conditions for minimal logging in BOL (e.g., TABLOCK).
Try to load the data pre-sorted.
Run everything locally on the database server, not distributed over the network.
Efficiently Loading Data (IV)
Disk slices: even with a SAN, break out the following onto separate slices if possible: EIM_* tables, TEMPDB, base (data) tables, and indexes. Do this by dropping the table and recreating it on a different filegroup, or by modifying the install script and configuration files. This cuts down on fragmentation and contention, especially for data loading; for example, put the EIM tables and indexes on a separate filegroup from the base tables, possibly just for the initial load. Weigh the maintenance issues associated with extra filegroups.
Efficiently Loading Data (V)
RAID and EIM: due to the constant UPDATE, INSERT, and DELETE activity, try to use RAID 0+1 if possible; the parity-bit calculation penalty can be significant. Size the Siebel database (siebeldb, or whatever it is named in production) appropriately, and ensure that it will not have to autogrow during the process; autogrow will hurt disk I/O and performance.
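Pre-sizing can be sketched as follows (the logical file name and target size are placeholders):

```sql
-- Grow the data file up front so the load never pauses for autogrow
ALTER DATABASE siebeldb
MODIFY FILE (NAME = siebeldb_data, SIZE = 50GB)
```

Size both the data and log files for the peak of the load, not the steady state.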
Efficiently Loading Data (VI)
When running EIM itself, run processes in parallel with different batch numbers; they can be executed against the same interface tables. Try to run from opposite ends of the batch range to help cut down on blocking (e.g., run batches 1 and 5 at the same time, not 1 and 2). Test to see how many threads can be run on your system: start with two and add as appropriate. If you are blocking and getting lock escalations, use sp_indexoption to set the clustered index to no page locks; see BOL for more information.
Efficiently Loading Data (VII)
Disable any triggers on the databases (such as workflow triggers) and then re-apply/enable them after the process is done. If this is not for an initial data load, this means that Workflow Manager or Assignment Manager will not function during the load for the new or updated data.
Efficiently Loading Data (VIII)
Load multiple base tables from one interface table. In the IFB file, set the parameter USING SYNONYMS to FALSE only if you are not associating multiple addresses with accounts. If you have a 1:1 ratio, you are telling the EIM process that account synonyms do not require processing during the import, reducing the amount of work EIM needs to do (as well as the load on the system).
Recovery Models During EIM
Use SIMPLE or BULK_LOGGED if possible. Run full backups frequently during the day, and weigh the cost of recovery against lost data. Note: switching from FULL to SIMPLE breaks the log chain and has recovery consequences. Always take a full backup after switching back to FULL to restart the log chain.
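A sketch of the load-window sequence (the backup path is a placeholder):

```sql
ALTER DATABASE siebeldb SET RECOVERY BULK_LOGGED  -- minimal logging for bulk operations
GO
-- ... run the EIM load ...
GO
ALTER DATABASE siebeldb SET RECOVERY FULL
BACKUP DATABASE siebeldb
TO DISK = 'D:\backup\siebeldb_post_load.bak'      -- restart the log chain
```

Until that post-load full backup completes, point-in-time recovery is not possible, so schedule it immediately after the load.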
Bringing It All Together (I)
Optimize your EIM:
Batch size.
Hint removal: Siebel and SQL Server.
Turn off docking/replication.
Get rid of workflow triggers.
Only load the tables needed from the Siebel metadata; loading the whole catalog can represent 25% of the time of your whole batch run.
Run batches in parallel.
Exclude validation of unused data: if you know something is never NULL (e.g., EmpID), then don't check for it.
Bringing It All Together (II)
Update statistics: stale stats cause problems; investigate turning autostats off.
Defragment between large runs, both EIM_* and base tables. On large initial loads, set fill factor and pad index to 50% to cut down on page splits (the default is 100% full).
Use minimally logged operations to load and scrub: BULK INSERT, SELECT INTO, and the right recovery model.
Run all data loading locally, and scrub data inside SQL Server with no cursors.
Make the right indexes: try a monster clustered index only, get rid of unused indexes, and add them back after runs.
Work with ES to resolve support issues.
Bringing It Together (III)
Slice your disks right: more spindles = more performance. Don't believe the vendor when they say cache will solve your problems; it only helps hide them. Avoid RAID 5 if possible, and separate data, log, and indexes.
Questions?
Frank McBathAmericas Technical SpecialistSiebel Microsoft Global Alliance [email protected]
select datediff(mi, last_batch, getdate()) 'minutes',
       spid, waittype, cpu, physical_io,
       convert(char(15), hostname),
       convert(char(15), program_name),
       convert(char(20), getdate()),
       spid, last_batch, cmd
from master..sysprocesses
where spid > 50
  and cmd not like '%WAIT%'
  and datediff(mi, last_batch, getdate()) > 1
order by last_batch
/*
SQL User Name, CPU, Reads, Writes, Duration, Connection ID, SPID, Start Time:
SADMIN62538959337016453516980918:42:16.327
*/
/*
UPDATE BT SET
    BT.END_DT = IT.AS_END_DT,
    BT.NAME = IT.AS_NAME,
    BT.START_DT = IT.AS_START_DT,
    BT.X_CHANGE_CODE = IT.X_CHANGE_CODE,
    BT.X_CHANGE_DATE = IT.X_CHANGE_DATE,
    BT.X_CHANGE_TYPE = IT.X_CHANGE_TYPE,
    BT.X_POLICY_TYPE = IT.X_POLICY_TYPE,
    BT.X_PREMIUM = IT.X_PREMIUM,
    BT.X_PRINTED_FLG = IT.X_PRINTED_FLG,
    BT.X_PRODUCT_DESC = IT.X_PRODUCT_DESC,
    BT.X_PRODUCT_TYPE = IT.X_PRODUCT_TYPE,
    BT.X_RATE_PLAN_CD = IT.X_RATE_PLAN_CD,
    BT.X_SOURCE_SYSTEM = IT.X_SOURCE_SYSTEM,
    BT.LAST_UPD = @P1,
    BT.LAST_UPD_BY = @P2,
    BT.MODIFICATION_NUM = BT.MODIFICATION_NUM + 1
FROM
    dbo.S_ASSET BT (INDEX = S_ASSET_P1),
    dbo.S_ASSET_IF IT (INDEX = S_ASSET_IF_M1)
WHERE
    (BT.ROW_ID = IT.T_ASSET__RID AND
     IT.IF_ROW_BATCH_NUM = 10410001 AND
     IT.IF_ROW_STAT_NUM = 0 AND
     IT.T_ASSET__EXS = 'Y' AND
     IT.T_ASSET__UNQ = 'Y' AND
     IT.T_ASSET__DUP = 'N' AND
     IT.T_ASSET__STA = 0)
*/
SET STATISTICS IO ON
GO
SELECT COUNT(*)
FROM
    dbo.S_ASSET BT (INDEX = S_ASSET_P1),
    dbo.S_ASSET_IF IT (INDEX = S_ASSET_IF_M1)
WHERE
    (BT.ROW_ID = IT.T_ASSET__RID AND
     IT.IF_ROW_BATCH_NUM = 10410001 AND
     IT.IF_ROW_STAT_NUM = 0 AND
     IT.T_ASSET__EXS = 'Y' AND
     IT.T_ASSET__UNQ = 'Y' AND
     IT.T_ASSET__DUP = 'N' AND
     IT.T_ASSET__STA = 0)
/*
WITH HINTS:
Table 'S_ASSET'. Scan count 1273, logical reads 4038, physical reads 0, read-ahead reads 0.
Table 'S_ASSET_IF'. Scan count 1, logical reads 5875, physical reads 0, read-ahead reads 0.
WITHOUT HINTS:
Table 'S_ASSET'. Scan count 1273, logical reads 4038, physical reads 0, read-ahead reads 0.
Table 'S_ASSET_IF'. Scan count 1, logical reads 1774, physical reads 0, read-ahead reads 0.
*/
-- DBCC SHOW_STATISTICS (authors, UPKCL_auidind)
-- GO
dbcc show_statistics (s_fn_need, 2)

-- generate statements to show the distribution for every large Siebel index
select 'dbcc show_statistics (' + object_name(id) + ',' + name + ')' + char(13) + 'go' + char(13)
from sysindexes si
where object_name(si.id) like 'S_%' and rows > 2000
-- order by object_name(si.id)
order by rows
go
drop table #all_idx
go
-- make a temp table with all Siebel tables and indexes
SELECT so.name, si.indid  -- count(*)
into #all_idx
from sysobjects so, sysindexes si
where so.name like 'S_%' and so.id = si.id and so.type = 'U'  -- and si.indid = 1
go
-- 867 tables with a clustered index
select count(*)
from sysobjects
where name like 'S_%' and type = 'U'
-- 1172 tables
select 1172 - 867
-- 305 tables DO NOT have clustered indexes

-- remove everything from the temp table that is not a clustered index
delete from #all_idx where indid < 1
delete from #all_idx where indid > 1

-- this query finds all tables that do NOT have a clustered index
select *
from (#all_idx as a right outer join sysobjects as so
      -- outer join, matching the ones that DO and DO NOT exist into one result set
      on a.name = so.name)
where
    so.type = 'U' and       -- just user tables
    so.name like 'S_%' and  -- just the Siebel tables
    a.name is NULL          -- only the ones without a clustered index
                            -- ("is null" because they do not exist in the temp table)
order by so.name            -- sort by name
Appendix (II)Configuration Specs Real World
sp_configure
go

name                                minimum      maximum      config_value  run_value
----------------------------------- ------------ ------------ ------------- ----------
affinity mask                       -2147483648  2147483647   0             0
affinity64 mask                     -2147483648  2147483647   0             0
allow updates                       0            1            0             0
awe enabled                         0            1            1             1
c2 audit mode                       0            1            0             0
cost threshold for parallelism      0            32767        5             5
Cross DB Ownership Chaining         0            1            0             0
cursor threshold                    -1           2147483647   -1            -1
default full-text language          0            2147483647   1033          1033
default language                    0            9999         0             0
fill factor (%)                     0            100          0             0
index create memory (KB)            704          2147483647   0             0
lightweight pooling                 0            1            0             0
locks                               5000         2147483647   0             0
max degree of parallelism           0            32           0             0
max server memory (MB)              4            2147483647   2147483647    2147483647
max text repl size (B)              0            2147483647   65536         65536
max worker threads                  32           32767        255           255
media retention                     0            365          0             0
min memory per query (KB)           512          2147483647   1024          1024
min server memory (MB)              0            2147483647   0             0
nested triggers                     0            1            1             1
network packet size (B)             512          65536        4096          4096
open objects                        0            2147483647   0             0
priority boost                      0            1            0             0
query governor cost limit           0            2147483647   0             0
query wait (s)                      -1           2147483647   -1            -1
recovery interval (min)             0            32767        0             0
remote access                       0            1            1             1
remote login timeout (s)            0            2147483647   20            20
remote proc trans                   0            1            0             0
remote query timeout (s)            0            2147483647   600           600
scan for startup procs              0            1            1             1
set working set size                0            1            0             0
show advanced options               0            1            1             1
two digit year cutoff               1753         9999         2049          2049
user connections                    0            32767        0             0
user options                        0            32767        0             0
exec master..xp_msver Go
Index  Name         Internal_Value  Character_Value
------ ------------ --------------- ---------------
1      ProductName                  NULL