Easy, Quick & Effective
Table Validation (Data Warehouse)
Presenters: Santhosh Venkatesh and
Manmohan Muralidhar
Abstract
Tables are the building blocks of any database or data warehouse. For any DB/DW project, the foremost test is therefore to validate every new or altered table in the DB.
Validating a table's structure (naming, columns, data type, length, precision, scale, nullability, primary keys, partition keys) is a monotonous manual task for any tester.
By automating this process, we focus on saving a huge manual effort over time, avoiding the user oversights that lead to defect leakage, and, most importantly, making more time available for actual functional testing.
Introduction
The main reason for testing the structure of new or altered tables is to make sure the code built over them meets the requirements and design, so that it does not fail or hinder functionality in any case.
Since manual validation is error-prone and time-consuming, this presentation looks into two methodologies:
1. Table Validation using MS Excel Macros & Shell Scripting
- by Manmohan
2. Table Validation using QTP
- by Santhosh
Common Testing Methodology
The common test approach for validating tables is to manually compare the table design/requirements with the description of the table pulled from the DB.
In data-intensive projects (like our Data Warehousing and Business Intelligence ones), where we create or alter 10-100 tables (with ~100 to ~4000 columns), this manual approach becomes a monotonous task, consuming most of the resources and time in validating all the created/altered tables.
Manual
Comparison
Pros & Cons
Pros:
1. Easy to implement
Cons:
1. Prone to user error
2. Time-consuming
3. Not repeatable
Table Validation using MS Excel Macros & Shell Scripting
In this methodology, the requirements/design for each of the tables in question are used to generate SQL scripts via Excel macros; the generated SQL scripts are then run by a shell script on a UNIX machine with access to the DB in question, to find all discrepancies.
Below is a symbolic representation of this process,
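The core of the macro is turning each requirements row into a mismatch-detecting query. Below is a minimal sketch of that idea, written in Python rather than the Excel/VBA macro the presentation actually uses; the spreadsheet-row layout (the dict keys) and the Oracle-style ALL_TAB_COLUMNS dictionary view are illustrative assumptions.

```python
# Sketch only: the real implementation is an Excel/VBA macro.
# ALL_TAB_COLUMNS is an Oracle-specific dictionary view, and the
# requirements-row layout below is an assumption for illustration.

def generate_check_sql(row):
    """Build one query that returns a row only when the column's
    actual definition in the DB dictionary differs from the
    documented requirement."""
    return (
        "SELECT table_name, column_name FROM all_tab_columns "
        f"WHERE table_name = '{row['table']}' "
        f"AND column_name = '{row['column']}' "
        f"AND NOT (data_type = '{row['type']}' "
        f"AND data_length = {row['length']} "
        f"AND nullable = '{row['nullable']}')"
    )

# One documented requirement (one spreadsheet row):
req = {"table": "CUSTOMER", "column": "CUST_ID",
       "type": "NUMBER", "length": 22, "nullable": "N"}
print(generate_check_sql(req))
```

Running the generated queries then yields output only for columns that deviate from the design, which is what the later log-cleansing step collects.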
Table Validation using MS Excel Macros & Shell Scripting
Below are the steps to follow in this methodology:
1. Import the requirements/design into Excel for each of the tables in question.
• The table structure needs to be defined in the Excel sheet in a particular format, with no required fields left empty.
2. Run the macro to generate the SQL scripts to a location on your local system.
• This generates 2 SQL scripts to a location specified by you, which then need to be run using the shell script.
3. FTP the scripts generated in Step 2, along with the included shell script, to a common location on a UNIX machine with access to the DB in question.
4. Run the shell script on the UNIX box.
• Once run, the shell script asks the user for DB credentials and then runs the SQL scripts on that DB.
5. Validate the log files generated by the shell script.
• Upon running the SQLs, the shell script generates log files, cleanses them for better readability, and places all discrepancies found in a negative log.
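Step 5's log cleansing can be sketched as a simple filter that keeps only the discrepancy lines for the negative log. The sketch below is in Python for brevity (the presentation's version is a shell script), and the marker strings are assumptions; the real script's patterns may differ.

```python
# Sketch of step 5: split a raw SQL*Plus-style log into a "negative
# log" of discrepancies only. The DISCREPANCY / ORA- / ERROR markers
# are assumed for illustration.

def extract_discrepancies(log_lines):
    markers = ("DISCREPANCY", "ORA-", "ERROR")
    return [ln for ln in log_lines if any(m in ln for m in markers)]

raw_log = [
    "Connected to the database.",
    "Table CUSTOMER ... OK",
    "DISCREPANCY: CUSTOMER.CUST_NAME length expected 50, found 40",
    "ORA-00942: table or view does not exist",
    "Done.",
]
for line in extract_discrepancies(raw_log):
    print(line)
```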
Pros & Cons
Pros:
1. Accurate and effective validation in quick time: Since every column attribute is compared one by one, there is no chance of missing any attribute. It performs effective and accurate validations very quickly.
2. 95% effort saved: About 95% of the effort required for DB validation is saved compared to the manual approach. The effort saved is directly proportional to the number of tables/columns.
3. Time availability: More time becomes available for actual functional testing.
4. Easy to implement: Since this method is built on commonly available tools, no special tool needs to be procured.
5. Simple and easy to use: There is no complexity to understand, and NO specific tool knowledge is required, so anyone who wants to validate table structures and has access to a UNIX machine (with DB access) can use this tool (System Engineers, Analysts, Developers, etc.).
6. Cost savings: Over a period of time, you will be able to save cost.
7. Repeatable: When a change/fix is applied to any of the objects by the DBA, the same script can be reused to do the complete validation again, so it can be used in regression as well.
Cons:
1. Semi-automated: This method involves user intervention to FTP files to the UNIX machine and to run the shell script.
2. Not compatible with HP QC/ALM
Case Study
Project: ERP Wave 1 – Fireworks & BidMaster
Requirement from DB perspective: A total of ~300 new tables (with ~5000 columns)
In this DWBI project, we had to create 140 new ETL data flows with ~300 new tables (~5000 columns), with a total test effort of ~800 hrs across 3 testers.
• Had we followed the traditional manual method of validating the tables, the 3 testers would each have required ~20 hrs to validate all the tables.
• Let's assume a design change was applied and a few discrepancies were found. With another delivery, the total time consumed would have been 20 + 20 hrs (to validate after re-delivery).
• If a tester overlooked a few tables and caused a few ETL data flows to fail during testing, the testers would need another round of table validation after the table fixes are applied, and a few scenarios would need to be re-tested.
This brings the total effort for table validations to:
20 + 20 + 20 = 60 hrs × 3 (testers) = 180 hrs + 20 (for retesting) = 200 hrs
By using this validation tool,
• The first validation requires just one tester to set up the macro (~4-6 hrs, a one-time effort) and to generate & run the scripts, which takes a maximum of another 1 hr.
• Considering the same scenarios as earlier (1-2 hrs for re-creating the scripts due to the requirement change), the total test effort for validation would be:
(4 + 1) + (2 + 1) + 0 (since there would be no leakage) = 8 hrs
So total effort saved = 200 - 8 = 192 hrs (>95% effort saved)
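The arithmetic above can be re-checked in a few lines, using only the figures quoted in this case study:

```python
# Re-checking the effort figures from the case study above.
manual = 20 + 20 + 20           # three validation rounds, hrs per tester
manual_total = manual * 3 + 20  # three testers, plus ~20 hrs of retesting
automated = (4 + 1) + (2 + 1)   # macro setup + run, then rework + rerun
saved = manual_total - automated
print(manual_total, automated, saved, round(saved / manual_total * 100))
# → 200 8 192 96, i.e. >95% of the effort saved
```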
Table Validation using QTP
In this methodology, the requirements/design for each of the tables are stored in an Excel spreadsheet, and a QTP script is run that pulls the table definitions from the database over an ODBC connection and compares them against the spreadsheet.
Below is a symbolic representation of this process,
Table Validation using QTP
Below are the steps to follow in this methodology:
1. Provide the test data (table names) in an Excel sheet, which will be imported by the script.
2. Provide the STTM/requirement information in an Excel sheet, which will be used for validation.
3. Run the script in QTP.
4. QTP interacts with the database to get the table/column information.
5. The script captures the DB results in the Excel sheet.
6. Compare the results.
7. Validate the test results.
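The comparison in steps 5-7 boils down to diffing two sets of column definitions: the documented STTM and what the database reports. A minimal sketch of that diff is shown below, in Python rather than the QTP/VBScript actually used; the dict layout is an assumption, and in the real flow `actual` would be populated over the ODBC connection.

```python
# Sketch of steps 6-7: diff expected column definitions against what
# the DB returned. In the presentation this runs inside QTP (VBScript)
# over ODBC; here the "actual" data is hard-coded for illustration.

def compare_columns(expected, actual):
    """Return one message per attribute that differs; empty list = pass."""
    issues = []
    for col, spec in expected.items():
        if col not in actual:
            issues.append(f"missing column {col}")
            continue
        for attr, want in spec.items():
            got = actual[col].get(attr)
            if got != want:
                issues.append(f"{col}.{attr}: expected {want}, found {got}")
    return issues

expected = {"CUST_ID": {"type": "NUMBER", "nullable": "N"}}
actual   = {"CUST_ID": {"type": "NUMBER", "nullable": "Y"}}
print(compare_columns(expected, actual))
# → ['CUST_ID.nullable: expected N, found Y']
```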
Pros & Cons
Pros:
1. Accurate and effective validation in quick time: Since every column attribute is compared one by one, there is no chance of missing any attribute. It performs effective and accurate validations very quickly.
2. 95% effort saved: About 95% of the effort required for DB validation is saved. The effort saved is directly proportional to the number of tables/columns.
3. Time availability: More time becomes available for actual functional testing.
4. Simple and easy: There is no complexity to understand, and only two inputs need to be provided, so it is very simple and easy to use.
5. Increased productivity bandwidth: You can bolster your productivity over a period of time.
6. Cost savings: Over a period of time, you will be able to save costs for the project and the organization.
7. Repeatable: When a change/fix is applied to any of the objects by the DBA, the same script can be reused to do the complete validation again.
8. Fully automated: This method is fully automated.
9. QC integration: The test can be run through QC.
Cons:
1. Requires availability of the QTP tool.
Case Study
Project: CIRAS
Requirement from DB perspective: A total of 180 new tables (with ~4000 columns)
In this DWBI project, we had to create 80 new ETL data flows with 180 new tables (~4000 columns), with a total test effort of 600 hrs across 3 testers.
• Had we followed the traditional manual method of validating the tables, the 3 testers would each have required ~17 hrs to validate all the tables.
• Let's assume a design change was applied and a few discrepancies were found. With another delivery, the total time consumed would have been 17 + 17 hrs (to validate after re-delivery).
• If a tester overlooked a few tables and caused a few ETL data flows to fail during testing, the testers would need another round of table validation after the table fixes are applied, and a few scenarios would need to be re-tested.
This brings the total effort for table validations to:
17 + 17 + 17 = 51 hrs × 3 (testers) = 153 hrs + 17 (for retesting) = 170 hrs
By using this QTP Script,
• The first validation requires just one tester to set up the STTM in an Excel spreadsheet (~1-2 hrs, a one-time effort) and to run the script, which takes a maximum of another 1 hr.
• Considering the same scenarios as earlier (1 hr for modifying the input spreadsheet due to the requirement change), the total test effort for validation would be:
(2 + 1) + (1 + 1) + 0 (since there would be no leakage) = 5 hrs
So total effort saved = 170 - 5 = 165 hrs (>95% effort saved)