This is a guest post from our author Geetanjali.
A data warehouse is a structured repository of historic data. It is usually:
- Subject Oriented
- Integrated
- Time Variant
- Non-volatile
Data warehouse testing poses many challenges because we need to test millions of records at a time. It is very important to follow the right testing procedure so that the effort does not end in confusion and end-to-end testing is completed successfully.
Let us discuss a few challenges in data warehouse testing, along with a proper solution to each.
| Challenges | Solutions |
| --- | --- |
| The lack of an exhaustive test data plan is a major challenge: the client may not be comfortable providing access to production data for testing, or sufficient data may not be available from the various source systems. | Clearly identify who is responsible for providing the test data. Back up the tables and data before executing any operations or inserting data, so that important data can be recovered if anything goes wrong. Define the backup and clean-up procedure to avoid ambiguity. |
| The tables hold a lot of data and the business logic is complex, so complete testing may not be practical within time and budget constraints, even though the client may expect the testing team to use all the available test data. | Use a sampling technique to optimize coverage and reduce testing time. Data comparison across homogeneous or heterogeneous data sources can also be automated. |
| Data warehouse testing has to deal with large volumes of data from multiple systems. Many data issues may be reported initially, and if they are not resolved quickly they may affect the project schedule. | It is always better to work on a sample of the data when resolving data issues. |
| Because a data warehouse has a complex architecture, a small change may have an impact in multiple places. | Define the regression testing approach properly. |
| If data volume is not considered when sizing the infrastructure for the project, there could be delays due to long procurement cycles and/or infrastructure-level bottlenecks during the test execution phase, as normal desktop machines may not be able to handle the magnitude of data. | Gather information about the data volume in the initial stages, so that the cost and timelines for procuring machines can be factored into a realistic schedule. |

To meet these challenges, we can proceed with ETL testing using a modular approach, so that we run into as few problems as possible as ETL testing progresses.
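The sampling and automated data comparison suggested above can be sketched in a few lines. This is only a minimal illustration, not a prescribed method: it assumes both the source system and the warehouse are reachable as SQL connections (in-memory SQLite stands in for both here), and the table `orders`, the columns `order_id` and `amount`, and the helper names `sample_keys`, `compare_sample` and `build_demo_db` are all hypothetical.

```python
import random
import sqlite3


def sample_keys(conn, table, key, sample_size, seed=0):
    """Pick a reproducible random sample of key values from a table.

    Assumption: `table` and `key` are trusted identifiers chosen by the tester.
    """
    keys = [row[0] for row in conn.execute(f"SELECT {key} FROM {table} ORDER BY {key}")]
    rng = random.Random(seed)  # fixed seed so a failing sample can be re-run
    return sorted(rng.sample(keys, min(sample_size, len(keys))))


def compare_sample(source, target, table, key, columns, keys):
    """Return the sampled keys whose rows are missing or different in the target."""
    cols = ", ".join([key] + columns)
    query = f"SELECT {cols} FROM {table} WHERE {key} = ?"
    mismatches = []
    for k in keys:
        src_row = source.execute(query, (k,)).fetchone()
        tgt_row = target.execute(query, (k,)).fetchone()
        if src_row != tgt_row:  # covers both a missing row and a changed value
            mismatches.append(k)
    return mismatches


def build_demo_db(rows):
    """Create an in-memory database holding a hypothetical `orders` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


# Demo data: the "warehouse" copy has one wrong amount (42) and one missing row (77).
source_rows = [(i, i * 1.5) for i in range(1, 101)]
target_rows = [(i, 0.0 if i == 42 else i * 1.5) for i in range(1, 101) if i != 77]
source = build_demo_db(source_rows)
target = build_demo_db(target_rows)

sampled = sample_keys(source, "orders", "order_id", sample_size=20, seed=7)
print("sampled keys:", sampled)
print("mismatched keys:",
      compare_sample(source, target, "orders", "order_id", ["amount"], sampled))
```

A sample only finds the defects that fall inside it, so the sample size is a trade-off between coverage and testing time. For tables with millions of rows, the keys would more likely be sampled inside the database rather than loaded into memory as this sketch does.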