A Multithreading C# Data Synchronization Program and Its Realization
Course: ECE 1747H Parallel Programming
Professor: Christiana Amza
Student / Presenter: Bin Li
Dec. 12, 2006 @ University of Toronto
Parallel Programming Professor: Christiana Amza Student: Bin Li Dec. 12, 2006 @ University of Toronto
Agenda
  Background
  Problem & Solution
  Parallel Implementation
  Performance Measuring
  Other Approaches
  Future Work
  Q & A
Background (Company & Project)
  “Retail Value Canada Inc.”
    Markham-based specialty retailer
    384 stores in Canada & USA, 30K types of items
    Head Office-side information maintained in Windows
    Store-side information maintained in Unix
  Data synchronization is needed
    Data type: product code, status, cost, price, promo, deal, subsidy, vendor, warehouse, etc. (by item, by store)
    Current application: iSync (developed in 2000 in Visual C# 1.0)
Background (System Architecture)
Problem & Solution
  Scheduled data synchronization (the iSync process) starts at 10 pm and ends at 12 am
  iSync extracts and transforms data from Windows into an ASCII file (.dat) and sends it to Unix
  After mass data modification, iSync takes 4-5 hours to run, which exceeds the 2-hour schedule limit
  The latest changes (e.g. prices) at head office cannot reach stores before the opening hour of the next business day
Problem & Solution (cont’d)
  The store-side information delay causes inaccurate sales information in retail stores
  Bottleneck: iSync (only 10% CPU usage on a 4-CPU database server)
Problem & Solution (cont’d)
  Sequential program
    iSync generates the .dat file store by store, which is slow
  Parallel solution
    Implementing qSync to replace iSync (using Microsoft C# multithreading)
    Generating .dat files in parallel, by store group
Parallel Implementation
  Development Environment
    Design/UML Tool: Microsoft Visio 2003
    Development Tool: Microsoft Visual Studio .NET 2003
    Programming Language: Visual C# 2.0 (multithreading similar to Linux PThreads)
  Parallelization Steps
    Store Data Segmentation
    Parallel Data Processing
    Result Data Consolidation
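The three parallelization steps can be sketched roughly as follows. This is only an illustration under stated assumptions, not the qSync implementation: the slides do not say how store groups are formed, so the round-robin split, the `ExtractStore` placeholder, and all names here are hypothetical. `groupCount` is assumed to be at least 1.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class QSyncSketch
{
    // Step 1 (segmentation): split store IDs into groupCount groups, round-robin.
    // The real grouping rule is not given in the slides; this is an assumption.
    public static List<List<int>> SegmentStores(IList<int> storeIds, int groupCount)
    {
        var groups = new List<List<int>>();
        for (int g = 0; g < groupCount; g++) groups.Add(new List<int>());
        for (int i = 0; i < storeIds.Count; i++) groups[i % groupCount].Add(storeIds[i]);
        return groups;
    }

    // Hypothetical placeholder for the per-store extraction work.
    public static string ExtractStore(int storeId)
    {
        return "store=" + storeId;
    }

    // Step 2 (processing): one thread per group, each writing to its own slot.
    // Step 3 (consolidation): single-threaded merge after every thread has joined.
    public static List<string> Run(IList<int> storeIds, int groupCount)
    {
        List<List<int>> groups = SegmentStores(storeIds, groupCount);
        var partial = new List<string>[groups.Count];
        var threads = new List<Thread>();

        for (int g = 0; g < groups.Count; g++)
        {
            int slot = g; // copy so the lambda does not capture the loop variable
            var t = new Thread(() =>
            {
                var lines = new List<string>();
                foreach (int id in groups[slot]) lines.Add(ExtractStore(id));
                partial[slot] = lines; // each thread owns exactly one slot, so no lock
            });
            t.Name = slot.ToString();
            threads.Add(t);
            t.Start();
        }

        foreach (Thread t in threads) t.Join();

        var merged = new List<string>();
        foreach (List<string> lines in partial) merged.AddRange(lines);
        return merged;
    }

    public static void Main()
    {
        var stores = new List<int>();
        for (int id = 1; id <= 384; id++) stores.Add(id);
        Console.WriteLine(Run(stores, 10).Count);
    }
}
```

Because each thread writes only to its own slot, the processing step needs no lock; the merge runs after all `Join` calls have returned.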
Parallel Implementation (cont’d)
User Interface (Screen 1)
User Interface (Screen 2)
Sample Code
    using System.Threading;

    // Looks up the instance count (HISSPNbrOfInst) in the RulesBasedSystem table
    private int getNbrOfInstance()
    {
        //...
        string sqlStmt = "select cast(RBSValue as int) from RulesBasedSystem " +
            "where RBSTxt = 'HISSPNbrOfInst' and RBSScopeKey = 'Retail Value'";
        //...
    }

    HISSPCLPSyncComponent clpComponent = null;
    clpComponent = new HISSPCLPSyncComponent();

    ThreadStart threadDelegate = null;
    Thread threadObj = null;
Sample Code (cont’d)
    for (int i = 0; i < dtb.Rows.Count; i++)
    {
        //...
        threadDelegate = new ThreadStart(clpComponent.ExtCLPPrice);
        threadObj = new Thread(threadDelegate);
        threadObj.Name = Convert.ToString(i);
        threadList.Add(threadObj);
        // Start the thread
        threadObj.Start();
    }

    // Join the threads
    for (int i = 0; i < dtb.Rows.Count; i++)
    {
        threadObj = (Thread) threadList[i];
        threadObj.Join();
    }

    while (j > 0) // Approach #3
    {
        lock (this)
        {
            consolidateCLPPrice(baseStoreId[j], itemBaseId[j], marketZoneId[j], itemPackId[j]);
        }
        j--;
    }
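The Approach #3 loop above takes `lock(this)` around each consolidation call. A common alternative is to lock on a private object, so that code outside the class cannot contend for the same lock. The sketch below shows that variant with an in-memory list standing in for the shared temp table; `SharedConsolidator`, `Consolidate`, and `LockSketch` are hypothetical names, not part of the original program.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class SharedConsolidator
{
    // Private lock object: no outside code can lock on it,
    // which is the usual reason to prefer it over lock(this).
    private readonly object _sync = new object();
    private readonly List<string> _rows = new List<string>();

    // Hypothetical stand-in for consolidateCLPPrice: records one row.
    public void Consolidate(int baseStoreId, int itemBaseId)
    {
        string row = baseStoreId + "," + itemBaseId;
        lock (_sync)
        {
            _rows.Add(row); // List<T> is not thread-safe, so the write is guarded
        }
    }

    public int RowCount
    {
        get { lock (_sync) { return _rows.Count; } }
    }
}

public static class LockSketch
{
    // Starts threadCount threads that each insert rowsPerThread rows.
    public static int Run(int threadCount, int rowsPerThread)
    {
        var consolidator = new SharedConsolidator();
        var threads = new List<Thread>();
        for (int i = 0; i < threadCount; i++)
        {
            int storeId = i; // copy so the lambda does not capture the loop variable
            var t = new Thread(() =>
            {
                for (int j = rowsPerThread; j > 0; j--)
                    consolidator.Consolidate(storeId, j);
            });
            threads.Add(t);
            t.Start();
        }
        foreach (Thread t in threads) t.Join();
        return consolidator.RowCount;
    }

    public static void Main()
    {
        Console.WriteLine(Run(4, 1000)); // every insert is kept: 4 * 1000 rows
    }
}
```

Only the shared write sits inside the lock; building the row string happens outside it, which keeps the critical section short.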
Performance Measuring
  Testing Environment
    Database Server
      Intel Xeon CPU 2.40 GHz, 4 CPUs, 3GB RAM
      Windows 2000 w/SP4, MS SQL Server 2000
      Subset of real production data
    Web/Application Server
      Intel Pentium 4, 2 CPU 3.40 GHz (HT), 2GB RAM
      Windows XP w/SP2, IIS 5.1
  Performance Counters
    CPU % Usage
    Execution Time
Performance Comparison
  Number of Threads    CPU Usage    Execution Time (sec)
  1 (sequential)       10%          2552
  2                    25%          1545
  5                    75%          527
  10                   93%          461
  20                   100%         370
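The speedup implied by these measurements can be computed directly. The sketch below assumes the rows pair up as 1 thread: 2552 s, 2 threads: 1545 s, 5 threads: 527 s, 10 threads: 461 s, 20 threads: 370 s; the class and method names are illustrative only.

```csharp
using System;

public static class SpeedupSketch
{
    // Speedup relative to the single-threaded (sequential) run.
    public static double Speedup(double sequentialSeconds, double parallelSeconds)
    {
        return sequentialSeconds / parallelSeconds;
    }

    public static void Main()
    {
        // Values as read from the Performance Comparison slide.
        int[] threads = { 1, 2, 5, 10, 20 };
        double[] seconds = { 2552, 1545, 527, 461, 370 };

        for (int i = 0; i < threads.Length; i++)
        {
            double s = Speedup(seconds[0], seconds[i]);
            Console.WriteLine("{0,2} threads: {1,5} s, speedup {2:F2}x",
                threads[i], seconds[i], s);
        }
    }
}
```

On these numbers the speedup is roughly 1.65x at 2 threads, 4.84x at 5, 5.54x at 10, and 6.90x at 20, so most of the gain is already reached by 5 threads.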
Performance Comparison (cont’d)
  [Chart: Parallel Data Synchronization (CPU % Usage Comparison); x-axis: Number of Threads (1, 2, 3, 5, 10, 15, 20); y-axis: CPU Usage (%), 0 to 100]
  [Chart: Parallel Data Synchronization (Execution Time - seconds); x-axis: Number of Threads (1, 2, 3, 5, 10, 15, 20); y-axis: Execution Time (seconds), 0 to 3000]
Other Approaches (Approach #2)
  “Locking Temp Files”
Other Approaches (Approach #2 cont’d)
  “Locking Temp Files”
    All threads write to a single .dat file
    A lock guards file appending
    Result: as bad as sequential
    Explanation: the same disk file cannot be shared simultaneously by different threads; it has to be closed and re-opened (unlike shared memory)
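A minimal sketch of this locked-append pattern, assuming each write opens, appends to, and closes the shared file while holding the lock. The file name, `FileAppendSketch`, and `AppendLine` are hypothetical; the slides do not show the original Approach #2 code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

public static class FileAppendSketch
{
    private static readonly object FileLock = new object();

    // File.AppendAllText opens, appends, and closes the file on every call.
    // Holding the lock around it fully serializes the writers.
    public static void AppendLine(string path, string line)
    {
        lock (FileLock)
        {
            File.AppendAllText(path, line + Environment.NewLine);
        }
    }

    // Starts threadCount writers and returns the number of lines in the file.
    public static int Run(string path, int threadCount, int linesPerThread)
    {
        File.WriteAllText(path, string.Empty); // start from an empty file
        var threads = new List<Thread>();
        for (int i = 0; i < threadCount; i++)
        {
            int id = i; // copy so the lambda does not capture the loop variable
            var t = new Thread(() =>
            {
                for (int j = 0; j < linesPerThread; j++)
                    AppendLine(path, "thread " + id + " line " + j);
            });
            threads.Add(t);
            t.Start();
        }
        foreach (Thread t in threads) t.Join();
        return File.ReadAllLines(path).Length;
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "qsync_sketch.dat");
        Console.WriteLine(Run(path, 4, 50));
        File.Delete(path);
    }
}
```

Since only one thread can be inside `AppendLine` at a time and each call pays the open/close cost, the file I/O runs one write after another, which is consistent with the slide's observation that this approach performs no better than the sequential program.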
Other Approaches (Approach #3)
  “Locking Temp Tables”
Other Approaches (Approach #3 cont’d)
  “Locking Temp Tables”
    All threads share a single temporary database table
    A lock guards table record insertion
    Result: much better than sequential, but not as good as the Main Approach
    Explanation: the database server has enough memory; the lock adds a slight delay
Future Work
  Database Parallelism
    Upgrading SQL Server 2000 to 2005
    Migrating the C# data synchronization code to database stored procedures and optimizing SQL queries
    Changing temporary table(s) to a permanent schema
    Using SQL Server Integration Services (SSIS 2005) for parallel data load & transformation
    Accessing the permanent table (which contains the final data to be synchronized) to generate the .dat file
Q & A
  Thanks!
Additional Slide for Q&A (C#)
  C# (pronounced “C Sharp”)
    Microsoft .NET Framework-compliant language
    Simple, modern, object-oriented programming language derived from C and C++
    Aims to combine the high productivity of Visual Basic and the raw power of C++
  C# vs Java
    Similar but not the same in language specifications
    Compilation: C# to Microsoft Intermediate Language (MSIL), Java to Java bytecode
    Running: C# in the Common Language Runtime (CLR), Java in the Java Virtual Machine (JVM)
Additional Slide for Q&A (Main Approach vs Approach #3)
  [Chart: Parallel Data Synchronization (Execution Time - Seconds); x-axis: Number of Threads (10, 20); y-axis: 0 to 800; series: Main Approach (no data lock), Approach #3 (locking temp table)]