ComparX-R Senthil Sundaresan Version 1 By SenthilMurugan. S [email protected]

ComparX-R (row-based comparison tool)

DESCRIPTION

This is an Excel comparison tool developed in VBA. It compares the data in two sheets row by row.

This document describes how to use the ComparX tool. I hope it helps you understand fully what the tool does.

The source code is hidden in the tool.

I started by writing a simple VBA macro to automate our comparison process [of data from legacy systems and data generated by distributed systems]. At one stage I decided to develop it into a full tool that would be helpful for other TCSers too.

This tool helps our teammates during all phases of our project, for COMPARISON as well as for RECONCILIATION processes.

The tool has some limitations, which are listed later in this document and will be addressed in the future. Some future enhancements are also listed.

Have a wonderful experience with ComparX.

For feedback: [email protected]

ComparX-R

ComparX-R is a comparison tool.

It is created in Excel using VBA. You can compare two sets of data sequentially. The comparison is done row by row, column by column.

It contains 25 source code modules.

Application Developed In : EXCEL 2003

Language Used : VBA

Operability : Windows Operating System

Description

Contents of ComparX-R

Use

WYCWYG

Options

Layouts

Specific Features

Limitations

Difference between ComparX-R and ComparX-C

Future Enhancements

ComparX-R is a comparison tool that is very helpful when you have to compare large volumes of data.

In day-to-day activities, Excel is very much a part of both our official and personal work.

Here you can put your data in the source sheets and then apply any of the options available in the menu to your data.

This tool generates about 10 reports that you can use for your analysis.

Ex:

When you are going to compare production data and test data, this tool will be very useful.

When you are going to compare Mainframe data and ETL data [in migration projects], this tool is very useful for analysis as well as for presentation.

Process:

It takes the first row from the ETL source data and compares it with the first row of the Mainframe source data, then continues in the same way for each following row.

Both Mainframe and ETL data should be sorted in the same order on the same columns. If the Mainframe sheet has a record in row 2, the ETL sheet should have the corresponding record in the same row. [Those rows may or may not contain mismatched columns.] But the key [the columns used for sorting] should be in sync between Mainframe and ETL.
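The process above can be sketched as follows. This is a minimal Python illustration of sequential, row-by-row comparison, not the tool's own VBA code (which is hidden). The function name `compare_rows` is hypothetical, and rejecting unequal row counts with an error is an assumption made for the sketch.

```python
from typing import List, Sequence

Row = Sequence[str]


def compare_rows(mainframe: Sequence[Row], etl: Sequence[Row]) -> List[str]:
    """Compare row N of one source with row N of the other; return one status per row."""
    # Assumption: refuse unequal row counts rather than guess an alignment.
    if len(mainframe) != len(etl):
        raise ValueError("row counts differ between Mainframe and ETL")
    statuses = []
    for mf_row, etl_row in zip(mainframe, etl):
        # A row is MATCHED only when every column value is identical.
        same = list(mf_row) == list(etl_row)
        statuses.append("MATCHED" if same else "MISMATCHED")
    return statuses


# Both sources are sorted on the same key (first column), so rows line up.
mainframe = [["101", "Asha", "500"], ["102", "Ravi", "750"]]
etl = [["101", "Asha", "500"], ["102", "Ravi", "700"]]
print(compare_rows(mainframe, etl))  # ['MATCHED', 'MISMATCHED']
```

Because rows are paired purely by position, a single missing or unsorted record shifts every later pair, which is why both sources must be sorted on the same key.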

Contents of ComparX-R

ComparX-R Contains Several Sheets:

1) Navigation

2) Steps

3) Sample Data

4) Colors List

5) Mainframe

6) ETL

7) Compare Data

Navigation: Home Page of ComparX-R

Steps:

Steps to be followed for copying data into Source files [Mainframe and ETL]

Sample Data:

Sample data for testing the tool and getting a quick look at it.

Colors List:

List of colors and their numbers, to refer to while formatting reports.

Mainframe:

Input Source file to be compared.

ETL:

Output Source file to compare with Mainframe Data

Compare Data:

Main menu. It offers many options, grouped into four submenus:

•Create Reports

•File Activities

•Formats

•Navigator

Create Reports

Report Creation

•Compared Report

•Matched Report

•MisMatched Report

•MisMatched_Cols Report

•Stats Report

•Compared > 2000 Rows Report

•Matched > 2000 Rows Report

•MisMatched > 2000 Rows Report

•MisMatched_Cols > 2000 Rows Report

•Stats > 2000 Rows Report

File Activities

Data Loading

•Sample data loading

•External data loading

Clear Source Contents

Delete reports

Save Reports Alone

Save the file along with reports

Check Record Count of both Sources

Check Generated Report Count and Names

Formats

Formatting Reports

Export the generated reports to

•CSV

•HTML

•TXT file formats

Navigator

•Navigation to Generated reports

When you open the application, the sheets mentioned above are shown by default. If you want to hide some sheets, use the Hide buttons provided on the right side of the Navigation sheet.

What You Click is What You Get

Compared Report – Generates a comparison of both sources [matched and mismatched records], with format.

Compared > 2000 Rows Report – Generates a comparison of both sources [matched and mismatched records], without format.

Matched Report – Compares both sources and gives you only the matched records, with format.

Matched > 2000 Rows Report – Compares both sources and gives you only the matched records, without format.

Mismatched Report – Compares both sources and gives you only the mismatched records, with format.

Mismatched > 2000 Rows Report – Compares both sources and gives you only the mismatched records, without format.

Mismatched_Cols Report – Compares both sources and gives you only the mismatched records and mismatched columns, with format. It needs the Mismatched Report to be generated first.

Mismatched_Cols > 2000 Rows Report – Compares both sources and gives you only the mismatched records and mismatched columns, without format. It needs the Mismatched > 2000 Rows Report to be generated first.

Stats Report – Generates a statistics report for fewer than 2000 rows, with format. It needs the Compared Report and the Mismatched Report to be generated first.

Stats > 2000 Rows Report – Generates a statistics report for more than 2000 rows, with format. It needs the Compared > 2000 Rows Report and the Mismatched > 2000 Rows Report to be generated first.

PS: The less-than-2000-rows or greater-than-2000-rows variant is generated automatically when you click the Compared Report, Matched Report, or Mismatched Report button. This does not apply to the Stats Report, Stats > 2000 Rows Report, and Mismatched_Cols > 2000 Rows Report.

Reports:

•Compared Report and Compared > 2000 Rows Report

•Matched Report and Matched > 2000 Rows Report

•MisMatched Report and MisMatched > 2000 Rows Report

•MisMatched_Cols Report and MisMatched_Cols > 2000 Rows Report

•Stats Report and Stats > 2000 Rows Report

•Compared Report and Compared > 2000 Rows Report:

Each cell shows the Mainframe data and the ETL data together, joined by an = or <> symbol. After the last column of each row, the status is displayed as Matched or MisMatched.

It contains both matched and mismatched data.

Stats such as the number of matched and mismatched records for each column, and links to other sheets, are also generated and printed after the last record.

Compared Report - For fewer than 2000 rows

Compared > 2000 Rows Report - For more than 2000 rows

•Matched Report and Matched > 2000 Rows Report:

Each cell shows the Mainframe data and the ETL data together, joined by an "=" symbol. After the last column of each row, the status is displayed as Matched.

It contains matched data only.

Stats such as the number of matched records out of the total records, and links to other sheets, are also generated and printed after the last record.

Matched Report - For fewer than 2000 rows

Matched > 2000 Rows Report - For more than 2000 rows

•MisMatched Report and MisMatched > 2000 Rows Report:

Each cell shows the Mainframe data and the ETL data together, joined by a "<>" symbol. After the last column of each row, the status is displayed as MisMatched.

It contains mismatched records only.

Stats such as the number of matched and mismatched records for each column, and links to other sheets, are also generated and printed after the last record.

MisMatched Report - For fewer than 2000 rows

MisMatched > 2000 Rows Report - For more than 2000 rows

•MisMatched_Cols Report and MisMatched_Cols > 2000 Rows Report:

Each cell shows the Mainframe data and the ETL data together, joined by a "<>" symbol. After the last column of each row, the status is displayed as MisMatched.

It contains mismatched records and mismatched columns only.

Stats such as the number of mismatched columns, and links to other sheets, are also generated and printed after the last record.

The MisMatched Report is required for the MisMatched_Cols Report.

The MisMatched > 2000 Rows Report is required for the MisMatched_Cols > 2000 Rows Report.

•Stats Report and Stats > 2000 Rows Report:

The list of mismatched columns, their column numbers, and the number of mismatched values for each column are displayed.

A summary is also displayed: total rows and columns, total mismatched rows and columns, matched rows and columns, and the percentage for each.

Links to other sheets are displayed at the bottom of the report.

The Compared Report and the Mismatched Report are required for the Stats Report.

The Compared > 2000 Rows Report and the Mismatched > 2000 Rows Report are required for the Stats > 2000 Rows Report.
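The report prerequisites stated in this section can be checked with a small lookup. The following is a Python sketch only, not the tool's VBA logic. The table `PREREQUISITES` and the function `missing_prerequisites` are hypothetical names, and the report names are shortened for illustration.

```python
from typing import Dict, Iterable, List

# Prerequisites as stated in this section: which reports must exist first.
PREREQUISITES: Dict[str, List[str]] = {
    "MisMatched_Cols": ["MisMatched"],
    "MisMatched_Cols > 2000 Rows": ["MisMatched > 2000 Rows"],
    "Stats": ["Compared", "MisMatched"],
    "Stats > 2000 Rows": ["Compared > 2000 Rows", "MisMatched > 2000 Rows"],
}


def missing_prerequisites(report: str, generated: Iterable[str]) -> List[str]:
    """Return the reports that still have to be generated before `report`."""
    already = set(generated)
    return [name for name in PREREQUISITES.get(report, []) if name not in already]


print(missing_prerequisites("Stats", ["Compared"]))  # ['MisMatched']
print(missing_prerequisites("Matched", []))          # []
```

An empty result means the requested report can be generated straight away.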

•File Activities

Load Sample Data - Loads data from the Sample Data sheet into both the Mainframe and ETL sheets.

Load External Data - Loads external data from any delimited file or normal text file into the sheet you choose.

Clear Source Contents - Clears the contents of the source sheets [Mainframe and ETL].

Delete Reports - Deletes the generated reports, irrespective of the number of reports.

Save Sources & Reports - Saves only the sources [Mainframe and ETL] and the generated reports.

Save ComparX - Saves the entire application in the location where ComparX is stored.

Get Record Count - Shows the row and column count comparison in a popup dialog box.

Get Report Count - Shows the count and names of the available reports in a popup dialog box.
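As an illustration of Load External Data and Get Record Count, the Python sketch below parses delimited text into rows and then reports row and column counts for each source. It is not the tool's VBA code. The function names `load_delimited` and `record_counts` are hypothetical, and the sample data and delimiters are made up for the example.

```python
import csv
import io
from typing import List, Tuple


def load_delimited(text: str, delimiter: str) -> List[List[str]]:
    """Split delimited text into rows of string values."""
    return [row for row in csv.reader(io.StringIO(text), delimiter=delimiter)]


def record_counts(rows: List[List[str]]) -> Tuple[int, int]:
    """Return (row count, widest column count) for one source."""
    return len(rows), max((len(row) for row in rows), default=0)


mainframe = load_delimited("101|Asha|500\n102|Ravi|750\n", "|")
etl = load_delimited("101,Asha,500\n", ",")

print(record_counts(mainframe))  # (2, 3)
print(record_counts(etl))        # (1, 3)
```

Checking the counts before comparing is worthwhile because the row-based comparison expects the same number of rows in both sources.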

•Formats

Export to HTML - Exports the chosen report to HTML format. If the report has not been generated, it is not exported and a popup is shown.

Export to CSV - Exports the chosen report to CSV format. If the report has not been generated, it is not exported and a popup is shown.

Export to TXT - Exports the chosen report to TXT format with a delimiter. If the report has not been generated, it is not exported and a popup is shown.

You can export the Matched, Mismatched, and Compared reports to the formats mentioned above.

All exported files [CSV, TXT, HTML] are saved in the location where ComparX-R is stored. CSV and TXT files go into the CSV and TXT folders there.
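The CSV and delimited TXT exports can be sketched in Python with the standard `csv` module. This is an illustration, not the tool's VBA export code. The function name `to_delimited` is hypothetical, the pipe delimiter for TXT is an assumption, and the sketch returns text instead of writing into the CSV and TXT folders.

```python
import csv
import io
from typing import List


def to_delimited(rows: List[List[str]], delimiter: str = ",") -> str:
    """Serialize report rows as delimited text, one report row per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


report = [
    ["[MF] 500 = 500 [ETL]", "MATCHED"],
    ["[MF] 750 <> 700 [ETL]", "MISMATCHED"],
]

print(to_delimited(report))        # comma-separated, as for CSV
print(to_delimited(report, "|"))   # pipe-separated, as for a delimited TXT file
```

Using a writer that quotes values when needed keeps cells intact even if the data itself contains the delimiter.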

Format Reports - You can format the Matched, Mismatched, and Compared reports. The Mismatched_Cols Report and the Stats Report cannot be formatted, as they already come with formats.

•Navigator

It takes you to any report that has been generated.

Reports:

Compared Report and Compared > 2000 Rows Report:

[MF] <Mainframe data> = <ETL data> [ETL] if both data are equal. This is displayed in a single cell.

[MF] <Mainframe data> <> <ETL data> [ETL] {Row #: | Column :} if both data are NOT equal. This is displayed in a single cell.

The Status column has the value MATCHED if both data match and MISMATCHED if they do not.
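The cell layout above can be illustrated with a short Python sketch. It is not the tool's VBA code. The function name `render_cell` is hypothetical, and placing the row number and column label after "Row #:" and "Column :" is an assumption about how the template is filled in.

```python
def render_cell(mf_value: str, etl_value: str, row: int, column: str) -> str:
    """Build the single-cell text for one Mainframe/ETL value pair."""
    if mf_value == etl_value:
        return f"[MF] {mf_value} = {etl_value} [ETL]"
    # Assumption: the row number and column label follow their captions.
    return f"[MF] {mf_value} <> {etl_value} [ETL] {{Row #: {row} | Column : {column}}}"


print(render_cell("500", "500", 2, "C"))  # [MF] 500 = 500 [ETL]
print(render_cell("750", "700", 3, "C"))  # [MF] 750 <> 700 [ETL] {Row #: 3 | Column : C}
```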

Status Rows:

1st Row: No. of rows in Mainframe = No. of rows in ETL; row count matched or not matched; completely matched records

2nd Row: No. of mismatched records; report generation time

3rd Row: Link to the top row of the report, to the Compare Data menu, and to the Steps sheet.

The Compared Report is generated with format.

The Compared > 2000 Rows Report is generated without format.

Matched Report and Matched > 2000 Rows Report:

[MF] <Mainframe data> = <ETL data> [ETL] if both data are equal. This is displayed in a single cell.

The Status column only has the value MATCHED, because the report holds only records whose data match.

Status Rows:

1st Row: No. of rows in Mainframe = No. of rows in ETL; completely matched records

2nd Row: No. of matched records; report generation time

3rd Row: Link to Top Row of the Report, to Compare Data Menu, to Steps Sheet.

The Matched Report is generated with format.

The Matched > 2000 Rows Report is generated without format.

MisMatched Report and MisMatched > 2000 Rows Report:

[MF] <Mainframe data> <> <ETL data> [ETL] if both data are NOT equal. This is displayed in a single cell.

[MF] <Mainframe data> = <ETL data> [ETL] if both data are equal. This is displayed in a single cell.

The Status column only has the value MISMATCHED, because the report holds only records whose data do not match.

Status Rows:

1st Row: No. of rows in Mainframe = No. of rows in ETL; mismatched records count

2nd Row: No. of mismatched records; report generation time

3rd Row: Link to the top row of the report, to the Compare Data menu, and to the Steps sheet.

The MisMatched Report is generated with format.

The MisMatched > 2000 Rows Report is generated without format.

MisMatched_Cols Report and MisMatched_Cols > 2000 Rows Report:

[MF] <Mainframe data> <> <ETL data> [ETL] if both data are NOT equal. This is displayed in a single cell.

[MF] <Mainframe data> = <ETL data> [ETL] if both data are equal. This is displayed in a single cell.

If an entire column has matching values, that column is deleted.

The Status column only has the value MISMATCHED, because the report holds only records whose data do not match.
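The column deletion described above can be sketched in Python as follows. This is an illustration, not the tool's VBA code. The function name `mismatched_columns_only` is hypothetical, and it assumes that "entire column" means the column across the mismatched records, since this report is built from the mismatched data.

```python
from typing import List, Sequence, Tuple

Row = Sequence[str]


def mismatched_columns_only(
    mainframe: Sequence[Row], etl: Sequence[Row], headers: Sequence[str]
) -> Tuple[List[str], List[List[Tuple[str, str]]]]:
    """Keep mismatched rows, then drop every column that matches in all of them."""
    # Step 1: keep only the row pairs that differ somewhere.
    pairs = [(m, e) for m, e in zip(mainframe, etl) if list(m) != list(e)]
    # Step 2: keep a column only if at least one remaining pair differs in it.
    keep = [c for c in range(len(headers)) if any(m[c] != e[c] for m, e in pairs)]
    kept_headers = [headers[c] for c in keep]
    kept_rows = [[(m[c], e[c]) for c in keep] for m, e in pairs]
    return kept_headers, kept_rows


headers = ["id", "name", "amount"]
mainframe = [["101", "Asha", "500"], ["102", "Ravi", "750"], ["103", "Mala", "900"]]
etl = [["101", "Asha", "500"], ["102", "Ravi", "700"], ["103", "Mala", "950"]]

print(mismatched_columns_only(mainframe, etl, headers))
# (['amount'], [[('750', '700')], [('900', '950')]])
```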

Status Rows:

1st Row: No. of rows in Mainframe = No. of rows in ETL; mismatched records count

2nd Row: No. of mismatched records; report generation time

3rd Row: Link to Top Row of the Report, to Compare Data Menu, to Steps Sheet.

The MisMatched_Cols Report is generated with format.

The MisMatched_Cols > 2000 Rows Report is generated without format.

Stats Report and Stats > 2000 Rows Report:

Table 1

Columns: Mismatched Column's Number | Mismatched Column's Name | Mismatched Records for Each Column

Last row: Total Mismatched Values | <Sum>

Table 2

Columns: Category | <Count of Records for each category> | <Percentage of Records for each category>

Categories:

•Total Rows

•Total Columns

•Mismatched Rows

•Mismatched columns of Mismatched Rows

•Matched columns of Mismatched Rows

•Matched Rows

•Matched columns

Links: Link to Compare Data Menu, Link to Mismatched Report, Link to Compared Report
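The figures in the two Stats tables can be sketched in Python as follows. This is an illustration, not the tool's VBA code. The function name `stats_report` is hypothetical, only three of the Table 2 categories are computed, and taking each percentage relative to the total row count is an assumption.

```python
from typing import Dict, List, Sequence, Tuple

Row = Sequence[str]


def stats_report(mainframe: Sequence[Row], etl: Sequence[Row], headers: Sequence[str]):
    """Return (Table 1 rows, total mismatched values, part of Table 2)."""
    per_column: Dict[int, int] = {}
    mismatched_rows = 0
    for mf_row, etl_row in zip(mainframe, etl):
        differing = [c for c in range(len(headers)) if mf_row[c] != etl_row[c]]
        if differing:
            mismatched_rows += 1
        for c in differing:
            per_column[c] = per_column.get(c, 0) + 1

    total_rows = len(mainframe)

    def pct(count: int) -> float:
        # Assumption: percentages are taken against the total row count.
        return 100.0 * count / total_rows if total_rows else 0.0

    # Table 1: column number (1-based), column name, mismatched records.
    table1: List[Tuple[int, str, int]] = [
        (c + 1, headers[c], n) for c, n in sorted(per_column.items())
    ]
    matched_rows = total_rows - mismatched_rows
    table2 = {
        "Total Rows": (total_rows, pct(total_rows)),
        "Mismatched Rows": (mismatched_rows, pct(mismatched_rows)),
        "Matched Rows": (matched_rows, pct(matched_rows)),
    }
    return table1, sum(per_column.values()), table2


headers = ["id", "name", "amount"]
mainframe = [["101", "Asha", "500"], ["102", "Ravi", "750"],
             ["103", "Mala", "900"], ["104", "Devi", "300"]]
etl = [["101", "Asha", "500"], ["102", "Ravi", "700"],
       ["103", "Malar", "950"], ["104", "Devi", "300"]]

table1, total_mismatched_values, table2 = stats_report(mainframe, etl, headers)
print(table1)                   # [(2, 'name', 1), (3, 'amount', 2)]
print(total_mismatched_values)  # 3
print(table2["Mismatched Rows"])  # (2, 50.0)
```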

Features

1)Gives you the Statistics. This will be really helpful for analysis and presentation.

2)Gives you the Report Generation Time.

3)Gives the flexibility of Formatting the reports for presentation

4)Converting the Excel Data into HTML, CSV, Delimited TXT file formats

5)Gives you the Row and Column count, so you can decide whether or not to go for the result set.

6)Gives you the Reports Count [Generated Reports]

7)Navigation to all reports, Main Menu, etc.

8)Hide and Show options

9)Colors list generation to see what colors you can use for your report formatting.

10)Loading external delimited data files, Loading the sample data for testing

11)Save Sources and Reports alone in a new file.

12)All reports and exported files are saved in the same folder where ComparX is saved.

13)Deletion of Reports.

Limitations

ComparX-R

•Can process 255 columns: the Excel limit is 256 columns per sheet, and one column is used for the status.

•Can process 65533 rows: the Excel limit is 65536 rows per sheet, and the status takes 3 rows.

•Row Count should be equal in both sources to get the desired result.

•Source files should be sorted properly.

•This is really helpful for sequential comparison and gives accurate results.

•Not as fast as UNIX scripts or other scripts, because it has to compare each cell of one sheet against the corresponding cell of the other sheet sequentially. UNIX scripts run on servers, so they are much faster.

•But comparison-wise it gives accurate results with a good look and feel.

•Rejected records will not be saved in a separate file.

•If a value is blank in one source and the corresponding value in the other source has data, the status is still displayed as Mismatched.

Difference between ComparX-R and ComparX-C

ComparX-R

Can process 255 columns: the Excel limit is 256 columns per sheet, and one column is used for the status. So it can process 255 columns from each source.

Can process 65533 rows: the Excel limit is 65536 rows per sheet, and the status takes 3 rows.

ComparX-C

Because it has to display 3 columns in the reports for each set of columns, it can process only 85 columns from each source.

Can process 65532 rows: the Excel limit is 65536 rows per sheet, and the status takes 4 rows.
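The capacity figures for both tools follow from the Excel 2003 sheet size of 65536 rows by 256 columns. The Python sketch below reproduces the arithmetic. The function name `capacity` is hypothetical, and reserving one status column for ComparX-C is an assumption, as the text only states this for ComparX-R.

```python
SHEET_ROWS = 65536     # Excel 2003 worksheet row limit
SHEET_COLUMNS = 256    # Excel 2003 worksheet column limit


def capacity(status_rows: int, status_columns: int, report_columns_per_source_column: int):
    """Return (max source rows, max source columns) that fit on one report sheet."""
    max_rows = SHEET_ROWS - status_rows
    max_columns = (SHEET_COLUMNS - status_columns) // report_columns_per_source_column
    return max_rows, max_columns


print(capacity(3, 1, 1))  # ComparX-R: (65533, 255)
print(capacity(4, 1, 3))  # ComparX-C: (65532, 85)
```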

PS: ComparX-C is a new tool whose report logic and layouts are completely different from those of ComparX-R. It will be uploaded to MIGHTY soon.

Future Enhancements

•FTP options

•Export to XML

•Process more than 65533 rows and 255 Columns

•Array-based processing [to expedite the process]

•Lookup irrespective of whether the row counts match or not

•Rejected data into a separate file

•Key based Search and Comparison

•Option for both Delimited and Fixed width files Processing

•E-Mailing the Generated reports

•Rejected records of both files will be placed in separate worksheets.
