How To Compare Microsoft Excel Worksheets With Florencesoft DiffEngineX

Overview

DiffEngineX reports the differences between two Microsoft® Excel workbooks. You can choose whether to compare every worksheet contained in the two books (select Whole Workbooks) or just selected sheets (select Selected Sheets). If you compare at the Whole Workbook level, worksheets with the same name are automatically compared with each other. If you want to compare two sheets with different names you must select Selected Sheets and make sure only one sheet is selected in each of the list boxes.

DiffEngineX only reports on the differences between the formulae and constants found in worksheets. It does not report on cell comment differences, nor does it compare charts.

Every comparison will generate a new workbook listing each cell difference. This new workbook will contain one worksheet for every pair of sheets compared. Its very last worksheet is a summary of the number of different cells found in each pair.

DiffEngineX does not modify the workbooks you select to be compared in any way. However when some of its options are selected it will create in-memory copies of your workbooks and modify these instead. DiffEngineX needs to modify these copies as part of the work done for Color Differences, Align Rows and Align Columns.

A step-by-step tutorial on how to compare two lists of data can be found at the bottom of this help page.

Click here to get the free trial version of DiffEngineX.

Important

If it seems DiffEngineX is failing to spot similarities between worksheets and as a result is reporting spurious differences, you will need to select Align Columns and/or Align Rows. Alignment is the insertion of blank rows and/or columns such that the identical cells in two sheets end up with the same row and column numbers. Identical cells are only recognised for what they are if they have the same row and column numbers.

When Align Columns is selected you will be asked to select a row (or the rows) containing column headings, after the Start Comparison button has been pressed. If the sheets being compared do not have column headings, select a row present in both sheets that has the same meaning and same row number.

When Align Rows is selected you will be asked to select a column (or the columns) that will be looked at during the process of row alignment, after the Start Comparison button has been pressed. Selection of columns A and/or B will work for most cases. If this is not appropriate, select the first non-blank column present in both sheets that has the same meaning and column number.

It is recommended that Color Differences is always selected, as it offers a much clearer way to see differences than a cell-by-cell listing.

Two important options to find out about are Compact Like Changes When Contiguous and The Actual Formulae/Their Calculated Values.

International Users

If your current regional settings do not match the language version of Office/Excel installed, DiffEngineX may not work correctly.

To prevent problems it is recommended you select the option Ensure application works when Excel language version not equal to Regional Settings.

Difference Report

The difference report is generated in a new workbook. It contains one worksheet for every pair of worksheets compared. Its last sheet is a summary of the number of differences found.

If row or column alignment is selected, the first entries show where the new blank rows and columns have been inserted. They are inserted to maximize the number of identical cells having matching co-ordinates between each pair of sheets. Cells with identical content will be flagged as different unless they share the same co-ordinates. The insertion of blank rows and/or columns by DiffEngineX indicates the changes between the two worksheets have included the addition or deletion of rows and columns.

The next entries list the cell differences. They are organized into five columns.

The first column contains the addresses of cells found to differ. (Each entry is the address of a single cell unless the option Compact Like Changes When Contiguous has been selected.) If row/column alignment is selected, the addresses refer to the workbook copies made by DiffEngineX. These addresses will differ from the ones in your original workbooks if blank alignment rows/columns have been inserted.

The next two columns quote the cell content found to differ.

The last two columns are only relevant if row and/or column alignment is selected. They contain the cell addresses of the different content in the original workbooks selected for comparison. These are unaffected by the insertion of blank alignment rows and columns. Blank rows and columns are only inserted in workbook copies. If you are referring back to your original workbooks you should use the values in these two columns rather than the very first one.

Extras

When Color Differences is selected, the cells that differ between two sheets are highlighted with color. When the comparison has finished, copies of the two workbooks you selected are generated with the added color, in addition to the difference report workbook. Once again these are _copies_ of your workbooks and will not be saved to your hard drive unless you ask Microsoft® Excel to.

The Extras dialog allows you to specify what colors are used. It is invoked by pressing the Extras button.

In the Extras dialog, a deleted cell is defined as a cell with content in Workbook #1, but no content in Workbook #2. An addition is defined as a cell with content in Workbook #2, but no content in Workbook #1. In this respect Workbook #1 can be considered the original workbook and Workbook #2 the modified copy.

Existing Color Removal

Existing workbook color may make inspection of the results difficult. The Extras dialog offers you the option of removing unconditional color from the workbook copies before it starts to highlight the differences. Color is not removed from the original workbooks you select.

Existing Hidden Sheets/Cells

Excel allows spreadsheet authors to hide whole rows and columns. Additionally whole sheets may be made invisible. Differences may occur in these hidden regions. Obviously there is little point in coloring a pair of cells if the results cannot be seen. An option exists to unhide sheets, rows and columns on the workbook copies. The original workbooks are not modified by selecting this option. Note that selecting this option will only have an effect if Color Differences is also selected on the main part of the user interface.

Hide Matching Rows

If large worksheets are compared and the different rows are widely separated, inspection of all the color highlighted rows may be difficult. An option exists to hide the matching rows, just leaving the different rows visible. Selecting Yes for this option will hide all matching rows. Selecting Yes, but show 4 rows on either side of each differing row as context will leave some matching rows visible.

Note that selecting this option will only have an effect if Color Differences is also selected on the main part of the user interface.

If you require the individual characters that differ between cells precisely highlighted on the difference report, see the option Color in Red Precisely The Parts of Formulae and Text Constants That Differ.

Align Columns

If you are comparing an original worksheet to its modified copy and part of the modifications included the insertion or deletion of columns you will need to select Align Columns.

Without Align Columns similar regions in the sheets will be incorrectly reported as different just because the same content is shifted to the left or right.

When the Align Columns checkbox is selected you will be asked to specify what rows are used to help with column alignment after pressing the Start Comparison button. A good choice is to select the row containing column headings, if one exists. At the very least select the first row that contains content.

If the modifications that have taken place do not include the insertion or deletion of columns this option should not be selected.

Align Columns works by inserting blank columns into the workbook copies being compared. These blank columns are color highlighted. The color used can be specified with the Extras dialog.

Align Columns Example

Consider the workbooks original.xls and modified.xls shown in Figure 1. You can see the new Website column has been inserted into modified.xls.

original1.xls and modified1.xls are the result of a comparison without column alignment. You can see the Personal and Work Emails have been incorrectly flagged as different.

original2.xls and modified2.xls are the result of a comparison with column alignment. Row 1 was added to the Selected Rows list using the Align Columns dialog box. Row 1 contains the column headings of First Name, Last Name, (Website), Personal Email and Work Email. The new Website column cells are correctly flagged as new content by use of the color green.

Align columns also works when columns have been deleted.

Figure 1.

Align Rows

If you are comparing an original worksheet to its modified copy and part of the modifications included the insertion or deletion of rows you will need to select Align Rows.

Without Align Rows, similar regions will be incorrectly reported as different just because the same content is shifted up or down.

When the Align Rows checkbox is selected you will be asked to specify what columns are used to help with row alignment after pressing the Start Comparison button. A reasonable choice is to select the first column with content.

If the modifications that have taken place do not include the insertion or deletion of rows this option should not be selected.

Align Rows works by inserting blank rows into the workbook copies being compared. These blank rows are color highlighted. The color used can be specified with the Extras dialog.

Align Rows Example

Consider the workbooks original.xls and modified.xls shown in Figure 2. You can see a new row with the content { 4400, Sports Car, 999 } has been added.

original1.xls and modified1.xls are the result of a comparison without row alignment. You can see three rows have been incorrectly flagged as different, when only 1 new row was added.

original2.xls and modified2.xls are the result of a comparison with row alignment. Column A was added to the Selected Columns list using the Align Rows dialog box. Column A contains order numbers that uniquely describe the contents of each row. The new row is correctly flagged as new content by use of the color green.

Align Rows also works when rows have been deleted.
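The effect of row alignment can be pictured with a short Python sketch. It is only an illustration under stated assumptions: DiffEngineX's own alignment method is not described on this page, so Python's difflib.SequenceMatcher is swapped in as a stand-in. Apart from the new row { 4400, Sports Car, 999 }, the order numbers and row contents are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import zip_longest


def differing_rows(rows1, rows2):
    """Return the indexes at which the two row lists differ position by position."""
    return [i for i, (a, b) in enumerate(zip_longest(rows1, rows2)) if a != b]


def align_rows(rows1, rows2, key_col=0):
    """Pad both lists with None (a blank row) so rows with equal keys share an index."""
    keys1 = [row[key_col] for row in rows1]
    keys2 = [row[key_col] for row in rows2]
    out1, out2 = [], []
    matcher = SequenceMatcher(a=keys1, b=keys2, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out1.extend(rows1[i1:i2])
            out2.extend(rows2[j1:j2])
        else:
            # Rows found only in sheet 1 face blank rows in sheet 2, and vice versa.
            out1.extend(rows1[i1:i2])
            out2.extend([None] * (i2 - i1))
            out1.extend([None] * (j2 - j1))
            out2.extend(rows2[j1:j2])
    return out1, out2


# Hypothetical data: column 0 plays the role of the order number in column A.
original = [(4398, "Van", 450), (4401, "Truck", 700), (4402, "Bus", 650)]
modified = [(4398, "Van", 450), (4400, "Sports Car", 999),
            (4401, "Truck", 700), (4402, "Bus", 650)]

print(differing_rows(original, modified))   # comparison without alignment
aligned1, aligned2 = align_rows(original, modified)
print(differing_rows(aligned1, aligned2))   # comparison after alignment
```

Without alignment every position after the inserted row is flagged. After alignment only the position holding the new row differs, which mirrors the behaviour described for original2.xls and modified2.xls.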

Figure 2.

Options

Compact Like Changes When Contiguous

Selecting this option can potentially reduce the verbosity of DiffEngineX reports.

For example, if three adjacent cells contain equivalent content and they are all changed to the same formula or constant, the change is reported on one line instead of three.
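The grouping idea can be sketched in Python as follows. This is a simplified illustration, not DiffEngineX's implementation: it assumes the formulae have already been rewritten in a relative (R1C1) form so that equivalent changes compare equal as text, and it only merges cells that sit next to each other on the same row. The function names are hypothetical.

```python
import re


def split_address(addr):
    """Split an A1 address such as 'E2' into (row number, column number)."""
    letters, digits = re.fullmatch(r"([A-Z]+)(\d+)", addr).groups()
    col = 0
    for ch in letters:
        col = col * 26 + ord(ch) - ord("A") + 1
    return int(digits), col


def compact(changes):
    """Merge runs of horizontally adjacent cells that share the same change."""
    groups = []  # each entry: [first address, last address, old content, new content]
    for addr, old, new in sorted(changes, key=lambda c: split_address(c[0])):
        row, col = split_address(addr)
        if groups:
            last_row, last_col = split_address(groups[-1][1])
            same_change = groups[-1][2:] == [old, new]
            if (row, col - 1) == (last_row, last_col) and same_change:
                groups[-1][1] = addr  # extend the current run
                continue
        groups.append([addr, addr, old, new])
    return [(a if a == b else f"{a}:{b}", old, new) for a, b, old, new in groups]


changes = [
    ("E2", "=R[-1]C[-4]*3", "=R[-1]C[-4]*9"),
    ("F2", "=R[-1]C[-4]*3", "=R[-1]C[-4]*9"),
    ("G2", "=R[-1]C[-4]*3", "=R[-1]C[-4]*9"),
]
for line in compact(changes):
    print(*line)
```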

For example

E2:G2 =A1*3 =A1*9 *

will be listed instead of

E2 =A1*3 =A1*9

F2 =B1*3 =B1*9

G2 =C1*3 =C1*9

*For multi-cell ranges of equivalent formulae, the single A1 style formula shown is relative to the first cell of the range.

Color Alternate Rows

Selecting this option makes difference reports easier to read as every other row is color highlighted.

Color in Red Precisely The Parts of Formulae and Text Constants That Differ

Selecting this option highlights the exact parts of formulae and text constants, with the color red, that differ between two worksheets. The highlighting is applied to the cell content quoted on the difference report.

Dates and numeric constants are not covered by this option.

This option can slow down comparisons when the number of cell differences is large. It is recommended that Compact Like Changes When Contiguous is selected together with this option in order to reduce the amount of work performed on each difference report.

This option should not be confused with Color Differences which is available on the main part of the user interface. Color Differences applies background color to whole cells on copies of the workbooks selected for comparison. The option discussed here applies foreground color to selected parts of formulae and text constants on the difference report.

An example of the precise highlighting offered by this option is shown below.

E1 =A1+Costs+4 =A1+NewCosts+6

G2 The quick cat. The slow cat.

A1 or R1C1 Notation

The reports DiffEngineX generates contain cell content where it is found to differ. If the differing content contains formulae this option allows it to be reported in either A1 or R1C1 notation. In A1 notation rows are labeled numerically and columns are labeled alphabetically. In R1C1 notation, both columns and rows are labeled numerically.
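To make the two notations concrete, the Python sketch below rewrites simple relative A1 references in R1C1 form. It is a deliberately minimal illustration with hypothetical function names: it assumes plain relative references only (no $ anchors, sheet names or quoted text) and says nothing about how DiffEngineX or Excel perform the conversion internally.

```python
import re

# One to three column letters followed by a row number, not followed by "("
# (so that a function name such as LOG10 is left alone).
REF = re.compile(r"\b([A-Z]{1,3})(\d+)\b(?!\()")


def col_number(letters):
    """Convert column letters to a 1-based column number (A=1, Z=26, AA=27)."""
    n = 0
    for ch in letters:
        n = n * 26 + ord(ch) - ord("A") + 1
    return n


def offset(prefix, delta):
    """Format a relative offset: 'R' for the same row, 'R[-1]' for one row up."""
    return prefix if delta == 0 else f"{prefix}[{delta}]"


def to_r1c1(formula, host):
    """Rewrite relative A1 references in formula as seen from the host cell."""
    host_letters, host_digits = REF.fullmatch(host).groups()
    host_row, host_col = int(host_digits), col_number(host_letters)

    def replace(match):
        d_row = int(match.group(2)) - host_row
        d_col = col_number(match.group(1)) - host_col
        return offset("R", d_row) + offset("C", d_col)

    return REF.sub(replace, formula)


print(to_r1c1("=A1*3", "E2"))
print(to_r1c1("=B1*3", "F2"))
```

Both calls yield the same text, which is one way to see why formulae that look different in A1 notation can still be equivalent.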

Case Insensitive Comparisons

Select this option if you want cell content to be compared without regard to its capitalization. For example when this option is checked the constant "Sales" will be treated as equivalent to "sales".

The Actual Formulae or Their Calculated Values

If two cells containing formulae are being compared, a choice has to be made whether to compare the actual formulae themselves or their calculated values.

For example if two cells containing =2*6 and =3*4 are compared with The Actual Formulae checked they will be reported as different. If Their Calculated Values is checked they will be reported as identical.
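The difference between these comparison modes, and the case insensitive option described earlier, can be illustrated with a small Python sketch. It is an assumption-laden toy: the evaluator only understands constant arithmetic such as =2*6 (no cell references or Excel functions), and the function names are hypothetical rather than anything DiffEngineX exposes.

```python
import ast
import operator

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}


def evaluate(formula):
    """Evaluate a constant arithmetic formula such as '=2*6'."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported formula: {formula}")
    return walk(ast.parse(formula.lstrip("="), mode="eval").body)


def cells_equal(a, b, compare_values=False, case_insensitive=False):
    """Compare two cell contents under the chosen options."""
    if compare_values and a.startswith("=") and b.startswith("="):
        return evaluate(a) == evaluate(b)   # Their Calculated Values
    if case_insensitive:
        return a.casefold() == b.casefold()
    return a == b                           # The Actual Formulae


print(cells_equal("=2*6", "=3*4"))                       # formulae compared as text
print(cells_equal("=2*6", "=3*4", compare_values=True))  # results compared
print(cells_equal("Sales", "sales", case_insensitive=True))
```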

Ensure application works when Excel language version not equal to Regional Settings

If your Control Panel Regional Options (such as French (Canada), Italian (Italy) etc.) do not match the localized language version of Excel you have installed, DiffEngineX will generate error messages each time it is run.

To prevent problems you may wish to consider one of the following.

Purchase a localized language version of Excel that matches your Regional Options.

Change your Control Panel Regional Options to match the language version of Excel.

Check the DiffEngineX option Ensure application works when Excel language version not equal to Regional Settings.

Changing your Control Panel Regional Options is not recommended as it has wide ranging effects.

Only check the DiffEngineX provided option if you encounter problems.

Figure 3.

Command Line Arguments

If command line arguments are supplied to DiffEngineX, it will compare workbooks without its user interface being displayed. This can be useful if you wish to compare multiple workbooks one after another using a series of commands stored in a *.bat file.

You must first locate the DiffEngineX.exe file. Typically it will have the location specified below.

C:\Program Files\Florencesoft\DiffEngineX\DiffEngineX.exe

Values are passed to DiffEngineX by means of switches. Each switch is prefixed with a forward slash / and is separated from its associated value or values by a colon :.

Some switches are associated with a single value. Others are associated with a comma separated list of values. File names and comma separated lists containing white space characters must be enclosed in double quotation marks, e.g. /sheets:"Cash Flow,Notes,Annual Fin St".

Here the switch /sheets limits the comparison to the sheets Cash Flow, Notes and Annual Fin St.

Although the examples shown here are split across several lines, ensure that each individual command does not contain newlines or carriage returns.
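As an illustration of these quoting rules, the Python sketch below assembles a single-line command suitable for pasting into a *.bat file. The helper names and the workbook file names are hypothetical; only the switch names and the executable location come from this page, and the sketch does not run DiffEngineX.

```python
EXE = r"C:\Program Files\Florencesoft\DiffEngineX\DiffEngineX.exe"


def format_switch(name, value=None):
    """Render one switch as /name or /name:value, quoting values with white space."""
    if value is None:
        return f"/{name}"
    if isinstance(value, bool):
        value = "true" if value else "false"
    elif isinstance(value, (list, tuple)):
        value = ",".join(value)          # comma separated list, no spaces added
    value = str(value)
    if any(ch.isspace() for ch in value):
        value = f'"{value}"'
    return f"/{name}:{value}"


def build_command(exe, switches):
    """Join the quoted executable path and all switches on one line."""
    return " ".join([f'"{exe}"'] + [format_switch(n, v) for n, v in switches])


command = build_command(EXE, [
    ("inbook1", "accounts 2008.xls"),    # hypothetical file names
    ("inbook2", "accounts 2009.xls"),
    ("report", "report.xls"),
    ("sheets", ["Cash Flow", "Notes", "Annual Fin St"]),
])
print(command)
```

Building the command as one string keeps it free of newlines, as required above.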

To display a list of all supported switches enter the following "C:\Program Files\Florencesoft\DiffEngineX\DiffEngineX.exe" /help from the command prompt.

The only mandatory switches are /inbook1, /inbook2 and /report. Typically you will want different cells to be color highlighted, identical changes grouped together (when adjacent) and the results saved to disk. Your original input workbooks are not modified by the color highlighting. The changes are made to copies. DiffEngineX will never overwrite existing files. To ensure your commands are not interrupted you should explicitly delete old reports beforehand. The below example achieves this.
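One way to script the clean-up step is sketched below in Python. It is an illustrative assumption, not the example from the original help page: the helper names and file names are hypothetical, and the sketch only prepares the command line rather than launching DiffEngineX.

```python
from pathlib import Path


def remove_stale_outputs(paths):
    """Delete output files left over from an earlier run; return those removed."""
    removed = []
    for path in map(Path, paths):
        if path.is_file():
            path.unlink()
            removed.append(path)
    return removed


def comparison_command(exe, book1, book2, report, out1, out2):
    """One-line command: colored differences, compacted changes, results saved."""
    return " ".join([
        f'"{exe}"',
        f'/inbook1:"{book1}"', f'/inbook2:"{book2}"', f'/report:"{report}"',
        f'/outbook1:"{out1}"', f'/outbook2:"{out2}"',
        "/colordifferences", "/compactchanges",
    ])


outputs = ["report.xls", "coloredcopy1.xls", "coloredcopy2.xls"]  # hypothetical names
remove_stale_outputs(outputs)  # harmless when the files do not exist
print(comparison_command(
    r"C:\Program Files\Florencesoft\DiffEngineX\DiffEngineX.exe",
    "myworkbook1.xls", "myworkbook2.xls", *outputs))
```

Deleting the three output files first matters because DiffEngineX will not overwrite existing files.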

The colors used to indicate modified, deleted and added cell content can be individually specified. If a color is not specified on the command line, the one used by the user interface is taken. The example command below additionally specifies that existing workbook color be removed using the switch /removeexistingcolor.

The available colors (1 - 56) are shown in the palette below.

Figure 4.

Switches

If a switch accepts a Boolean true or false value, then omitting the value is equivalent to specifying true, i.e. /colordifferences:true is the same as /colordifferences, but not the same as /colordifferences:false.
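The rule for Boolean switches can be expressed as a small Python sketch. This is a hypothetical parser written for illustration, not DiffEngineX's own argument handling, and it does not know which switches are Boolean: any switch given without a value is simply read as true.

```python
def parse_switch(arg):
    """Split '/name:value' into (name, value); a bare '/name' means True."""
    if not arg.startswith("/"):
        raise ValueError(f"not a switch: {arg!r}")
    # Split at the first colon only, so drive letters in paths are preserved.
    name, separator, value = arg[1:].partition(":")
    if not separator:
        return name, True
    value = value.strip().strip('"')
    if value.lower() in ("true", "false"):
        return name, value.lower() == "true"
    return name, value


for arg in ("/colordifferences", "/colordifferences:true", "/colordifferences:false"):
    print(parse_switch(arg))
```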

Each entry lists an Example Switch and Value, its Description, and the Action when Switch Omitted.

/inbook1:"myworkbook1.xls"
Description: Path and file name of 1st book to compare.
Action when switch omitted: Mandatory.

/inbook2:"myworkbook2.xls"
Description: Path and file name of 2nd book to compare.
Action when switch omitted: Mandatory.

/report:"mydiffreport.xls"
Description: Path and file name of output difference report.
Action when switch omitted: Mandatory.

/outbook1:"coloredcopy1.xls"
Description: Path and file name of altered copy of 1st book to output. Different cells will be colored if /colordifferences is specified as well.
Action when switch omitted: No copy saved.

/outbook2:"coloredcopy2.xls"
Description: Path and file name of altered copy of 2nd book to output. Different cells will be colored if /colordifferences is specified as well.
Action when switch omitted: No copy saved.

/alignrows:"A,B"
Description: Comma separated list of alphabetical columns to examine when aligning rows (A - IV). A maximum of 5 columns can be specified. Typically either "A" or "A,B" will be specified.
Action when switch omitted: Similar rows will not be aligned.

/aligncolumns:"1"
Description: Comma separated list of numerical rows to examine when aligning columns. A maximum of 5 rows can be specified. Only use this if all your sheets have a distinct row containing column headings.
Action when switch omitted: Similar columns will not be aligned.

/sheets:"Sheet1,Summary,Inputs"
Description: Specify this to limit comparisons to specific sheets.
Action when switch omitted: All the matching sheets in the 2 workbooks will be compared.

/colordifferences:true or false
Description: If true, different cells will be color highlighted. The /outbook1 and /outbook2 switches must also be specified.
Action when switch omitted: No action.

/removeexistingcolor:true or false
Description: Remove unconditional fill color from cells in workbook copies to make color highlighting clearer. Note /colordifferences must also be specified.
Action when switch omitted: No action.

/unhidesheetsrowscols:true or false
Description: Hidden sheets, rows and columns are made visible in workbook copies so differences cannot be obscured. Note /colordifferences must also be specified.
Action when switch omitted: No action.

/hidematchingrows:n
Description: n is integer (1 - 3). 1 hides matching rows. 2 hides matching rows except those near differing rows. 3 hides no rows. Note /colordifferences must also be specified. Takes precedence over /unhidesheetsrowscols.
Action when switch omitted: Matching rows will not be hidden.

/compactchanges:true or false
Description: Equivalent changes to adjacent cells are grouped together in difference report.
Action when switch omitted: No action.

/coloralternaterows:true or false
Description: Alternate lines in difference report are colored.
Action when switch omitted: No action.

/colorprecise:true or false
Description: Text and formulae differences are highlighted at the character level with color red in difference report. Time consuming option.
Action when switch omitted: No action.

/stylea1:true or false
Description: If true, formulae are listed using the A1 style in the difference report. If false, R1C1 is used.
Action when switch omitted: A1 style is used when switch omitted.

/caseinsensitive:true or false
Description: If true, strings are compared without regard to case.
Action when switch omitted: If omitted, case sensitive comparisons are used.

/compareformulae:true or false
Description: If true, formulae are directly compared, rather than their calculated end results.
Action when switch omitted: If omitted, formulae are compared.

/ensureworksinternationally:true or false
Description: If true, DiffEngineX will work despite Control Panel Regional Options differing from the language version of Excel.
Action when switch omitted: DiffEngineX will fail when Regional Options do not equal Excel language.

/modifiedcolor:n
Description: n is integer (1 - 56). /colordifferences must be specified as well.
Action when switch omitted: User interface color used.

/deletedcolor:n
Description: n is integer (1 - 56). /colordifferences must be specified as well.
Action when switch omitted: User interface color used.

/addedcolor:n
Description: n is integer (1 - 56). /colordifferences must be specified as well.
Action when switch omitted: User interface color used.

/alignrowcolor:n
Description: n is integer (1 - 56). /colordifferences must be specified as well.
Action when switch omitted: User interface color used.

/aligncolcolor:n
Description: n is integer (1 - 56). /colordifferences must be specified as well.
Action when switch omitted: User interface color used.

/help or /h or /?
Description: Displays the available switches. Any other switch, if specified as well, is ignored.
Action when switch omitted: No action.
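Before writing a long batch file it can help to check switch values against the limits listed above. The Python sketch below does this for a few of them. The function names are hypothetical and the checks are only a reading of the ranges documented on this page (columns A - IV, at most 5 entries, colors 1 - 56, /hidematchingrows 1 - 3); DiffEngineX may validate its input differently.

```python
import re


def column_number(letters):
    """Convert column letters to a 1-based number (A=1, IV=256)."""
    n = 0
    for ch in letters.upper():
        n = n * 26 + ord(ch) - ord("A") + 1
    return n


def check_alignrows(value):
    """Validate an /alignrows value: 1 to 5 column letters in the range A - IV."""
    columns = [c.strip() for c in value.split(",")]
    if not 1 <= len(columns) <= 5:
        raise ValueError("a maximum of 5 columns can be specified")
    for c in columns:
        if not re.fullmatch(r"[A-Za-z]{1,2}", c) or column_number(c) > 256:
            raise ValueError(f"column outside A - IV: {c!r}")
    return columns


def check_range(switch, n, low, high):
    """Validate an integer switch value against an inclusive range."""
    if not low <= n <= high:
        raise ValueError(f"/{switch} expects an integer from {low} to {high}, got {n}")
    return n


def check_color(switch, n):
    return check_range(switch, n, 1, 56)


def check_hidematchingrows(n):
    return check_range("hidematchingrows", n, 1, 3)


print(check_alignrows("A,B"))
print(check_color("modifiedcolor", 3))
```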

Tutorial: How to Compare Two Excel Lists

A common business problem often concerns finding out what names and addresses appear in one list but not another. After the new data has been identified it is useful to be able to extract it into a new Excel workbook.

DiffEngineX can do the bulk of this type of work. Knowing a few Excel tricks and what options to select in DiffEngineX can greatly improve the end results.

Consider the two lists shown below. Even though DiffEngineX has the capability to align similar rows it needs some help from you first. This is because some of the changes involve not just the vertical displacement of rows, but a reordering. In the first list the "Dobbs, Bob" row is before the "Rivers, Doreen" row. In the second list the order has been reversed.

Figure 5 - Two Lists

DiffEngineX will insert blank rows to get existing rows to match up, but it will not reorder them.

To get around this problem you should ask Excel to sort your lists before using DiffEngineX to compare them. (Sorting is an optional step. Your data may not require it.)

Below we see our two original lists after Excel has sorted them on last and first name. Alternatively we could have sorted them by their ID column.

Figure 6 - Two Lists Sorted by Excel

Step-by-Step Instructions

1. First Sort using Excel & Save: Use Excel to open the two workbooks you want to compare. Click on any cell in the first list. Now click Excel's Data menu (or tab in Excel 2007) and select the Sort item. The Sort dialog will now appear. Sort by Last Name and then by First Name. Hit OK. Now do the same for the second list. In your lists you can sort on any combination of columns that uniquely identifies each row. (Sorting is not always necessary.)

2. Save both your sorted workbooks (under different filenames if you prefer) before closing them.

3. Start up DiffEngineX - Use Options, Extras & Align Rows: Invoke DiffEngineX and click the Options button. In our example we can see that some street addresses are in upper case and others in lower case. Here we don't want such a trivial change to be counted as a modification and so we select the Case Insensitive Comparisons checkbox. Click OK to dismiss the dialog box.

4. Click the Extras button. In our example both our lists are small, but in real life some lists may contain tens of thousands of rows and have hundreds of differences between them. DiffEngineX uses color to highlight differences in automatically made copies of the workbooks it compares. We don't want to have to fish through thousands of rows just to see a few differences. Make sure the Yes option is selected for Hide Matching Rows. Click OK to dismiss the dialog box.

5. Select Align Rows on the main part of DiffEngineX's user interface. Ensure the Color Differences box is checked. Use the Browse buttons to point to your sorted Excel workbooks. Click the Start Comparison button.

6. We now have to tell DiffEngineX what columns uniquely identify each row. As we previously sorted on Last Name & First Name we select columns B and C before clicking the Add button. Hit OK to dismiss the dialog and start the comparison.

7. The results are shown below in Figure 7. We can see DiffEngineX has correctly spotted the three new rows. However the matching rows are still in this workbook. They are only hidden. (You can see that Excel is not showing rows 1, 2, 4, 5 and 7.)

8. The Final Step: Separate the Wheat from the Chaff: Select the Excel worksheet containing the color highlighted new rows. Click Excel's Edit menu and select Go To. Click the Special... button. Click Visible cells only. (If you are using Excel 2007, select the Home tab. Then select Go To Special... from the Find & Select drop-down menu. Select Visible cells only.) Hit OK. Select Edit--->Copy. You have now selected and copied just the visible, new rows.

9. Create a new Excel workbook and use Edit--->Paste to copy across just the new rows. You now have separated the new rows from the hidden matching rows.

Note: For more complicated examples than shown here, rows may end up being colored red, green or purple by default to indicate differences. Red and green indicate that, after row alignment, one of the two corresponding cells is blank. Purple means a cell has content in both sheets. You will have to inspect both the color highlighted sheets to find out all the differences.
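The idea behind the tutorial, sort both lists on a unique key and then pick out rows present in only one of them, can also be sketched in Python. This is a key-based shortcut shown for illustration, not what DiffEngineX does internally: it ignores changes to other columns. The names Dobbs, Bob and Rivers, Doreen come from the example lists; the remaining row and the ID values are hypothetical.

```python
def name_key(row):
    """Case insensitive key built from the Last Name and First Name columns."""
    return row["Last Name"].casefold(), row["First Name"].casefold()


def new_rows(first_list, second_list):
    """Rows of second_list whose name key does not appear in first_list, sorted."""
    known = {name_key(row) for row in first_list}
    return sorted((row for row in second_list if name_key(row) not in known),
                  key=name_key)


first_list = [
    {"ID": 1, "Last Name": "Dobbs", "First Name": "Bob"},
    {"ID": 2, "Last Name": "Rivers", "First Name": "Doreen"},
]
second_list = [
    {"ID": 2, "Last Name": "RIVERS", "First Name": "Doreen"},  # reordered, upper case
    {"ID": 1, "Last Name": "Dobbs", "First Name": "Bob"},
    {"ID": 3, "Last Name": "Archer", "First Name": "Zoe"},     # hypothetical new row
]

for row in new_rows(first_list, second_list):
    print(row["Last Name"], row["First Name"])
```

Because the key is compared as a set, the reordering of the Dobbs and Rivers rows has no effect on the result, which is the same reason the tutorial sorts both lists before comparing them.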

Figure 7 - DiffEngineX Hides Matching Rows

Figure 8 - Use Excel's Edit--->Go To--->Special--->Visible cells only before Copy & Paste