Agility is the basic need in all kind of testing these days. For testers it is tough and challenging task to inculcate agility in ETL testing as data is always huge with aggregate and complex logics may be applied on the data.
- Ensuring Data Quality.
- Treating the ETL as a black-box, and focus on the known inputs and outputs.
- Techniques such as data sampling, Exhaustive verification and aggregation can be applied to achieve the target.
What is Agility in Testing?
Agility in DW testing refers to the testing done in a away that it is quick as well as covers all the requirement specified by the client i.e. testing that is before time and apt and adequate as per clients requirement.
The right technology will support an agility by making sure the data warehouse is built right, built fast and built to last.
For ETL testers the statement testing quickly would be of great importance as it would include:
- Testing as soon as build is ready
- Identification of gaps either in code or technology
- Finding and reporting defects as soon as possible
- Maintenance of Quality of testing in spite of tight timelines.
Biggest challenge is least documentation due to lack of time.
Why is agility Important in DW testing?
Data Warehouse consist of huge set of data, and complex aggregations logics hence agility is of great importance so that full volume of data is covered in less time. DW testing not only includes ETL but also includes Data mining, data cleansing and data conditioning.
We need to do Adhoc testing, Smoke testing, Regression testing, unit and integration testing as a part of Data warehouse testing.
Most important Point to be kept in mind while DW testing is Data Quality.
So that the Number of Cycles of testing Increases as it is really important to test many times in DW testing to ensure complete data coverage.
Especially in the situation where Requirements keeps on changing agility plays a real important role.
What should be the steps to ensure agility in DW testing?
- Testers should have their own/private copies of the database(s), and there should only be one process (e.g. a test) running against the database(s) at any given moment.
- Always keep the backup of tables in case of mocking up the data.
- Each test should target as few tables as possible to increase the speed of queries being executed at a time
- Modification of Data which is already populated should be avoided
- Revert changes made to the database(s) in the tear-down of each test (This applies to both the source and warehouse databases).
- Many tests should not be dependent on each other.
- All tests should target data mappings, calculations, and data formats (nullable, boundary, and data-types).
- Some tests should concentrate on initial loads.
- Some tests should concentrate on incremental loads.
- Some tests should concentrate on incremental loads for which there were no source database changes.
- When testing the business rules, treat the ETL as a black-box, and focus on the known inputs and outputs as much as possible.
- There should be proper sync up between development and testing environment to avoid gaps during testing.
- Proper DDL validations should be there to avoid mismatches between filed of table as for development team its just change of field but for tester its huge amount of rework.
- Usage of sampling technique as much as possible.
- Test the data with data flow of the application at different stages.
- For the test cases which are to be executed every time regression testing is done should be automated.
The major challenges in DW testing are:
- Lot of dependency on development team hence to maintain a proper Sync-up is a tough task.
- Data is huge and aggregate with complex transformation logics due to which SQL queries take lot of time in execution.
- Business requirements that change on a monthly or even daily basis.
We should try to automate as much as possible as there are instances where we use the same kind of SQL queries for many test case hence we can automate them.
Automation of regression test cases is essential. Tool like Perfaware can also be used for automation.
This is guest post from Geetanjali Bhatia. If you would like to submit your post, please check Submit Article page for the guidelines.