This is a guest post from our author Geetanjali.
A data warehouse is a structured repository of historical data. It is usually:
- Subject-oriented
- Time-variant
Data warehouse testing poses many challenges because millions of records have to be tested at a time. It is very important to follow the right procedure so that testing does not end in confusion and end-to-end testing completes successfully.
Let us discuss a few challenges in data warehouse testing, each with a proposed solution.
Challenge: The lack of an exhaustive test data plan is a major challenge. The client may not be comfortable providing access to production data for testing, or sufficient data may not be available from the various source systems.
Solution: Clearly identify who will be responsible for providing the test data. Take a backup of the tables and data before executing any operations or inserting data, so that important data can be recovered if something goes wrong. Define the backup and cleanup procedure to avoid ambiguity.
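
The backup and cleanup procedure can be sketched as follows. This is a minimal illustration, assuming an in-memory SQLite database; the `sales` table, the `sales_backup` name, and the helper functions are hypothetical and not part of any particular project.

```python
import sqlite3

# Note: table names are interpolated into SQL here, so they are assumed
# to come from trusted test configuration, never from user input.

def backup_table(conn, table, backup):
    """Copy every row of `table` into a fresh `backup` table."""
    conn.execute(f'DROP TABLE IF EXISTS "{backup}"')
    conn.execute(f'CREATE TABLE "{backup}" AS SELECT * FROM "{table}"')

def restore_table(conn, table, backup):
    """Replace the contents of `table` with the rows saved in `backup`."""
    conn.execute(f'DELETE FROM "{table}"')
    conn.execute(f'INSERT INTO "{table}" SELECT * FROM "{backup}"')

def cleanup_backup(conn, backup):
    """Drop the backup table once it is no longer needed."""
    conn.execute(f'DROP TABLE IF EXISTS "{backup}"')

def demo_backup_restore():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 10.0), (2, 20.0)])

    backup_table(conn, "sales", "sales_backup")

    # Test operations that modify the data.
    conn.execute("INSERT INTO sales VALUES (3, -999.0)")
    conn.execute("UPDATE sales SET amount = 0 WHERE id = 1")

    # Something went wrong: put the original rows back, then clean up.
    restore_table(conn, "sales", "sales_backup")
    rows = conn.execute("SELECT id, amount FROM sales ORDER BY id").fetchall()
    cleanup_backup(conn, "sales_backup")
    conn.close()
    return rows
```

Calling `demo_backup_restore()` returns the two original rows, showing that the test insert and update were undone. On a real warehouse the same three steps would normally use the database's own backup or snapshot tooling.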
Challenge: Because there is a lot of test data spread across many tables and the business logic is complex, complete testing may not be practically possible within time and budget constraints, whereas the client may expect the testing team to use all the available test data.
Solution: Use a sampling technique to optimize coverage and reduce testing time. Data comparison against homogeneous or heterogeneous data sources can also be automated.
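
One way to combine sampling with automated comparison is sketched below. It assumes that rows from both sources have already been loaded into dictionaries keyed by primary key and normalized to the same tuple layout; the function names and the sample data are hypothetical.

```python
import random

def sample_keys(keys, fraction, seed=0):
    """Pick a reproducible random subset of keys (at least one key)."""
    keys = sorted(keys)
    if not keys:
        return []
    size = max(1, round(len(keys) * fraction))
    return random.Random(seed).sample(keys, size)

def compare_sample(source, target, keys):
    """Return {key: (source_row, target_row)} for sampled keys that differ.

    A key missing on either side shows up with None for that side.
    """
    mismatches = {}
    for key in keys:
        if source.get(key) != target.get(key):
            mismatches[key] = (source.get(key), target.get(key))
    return mismatches

# Illustrative data: 100 source rows, with one altered and one missing
# row on the target side.
source = {i: (f"cust{i}", i * 10) for i in range(1, 101)}
target = dict(source)
target[42] = ("cust42", 0)
del target[77]

sampled = sample_keys(source, fraction=0.2)
report = compare_sample(source, target, sampled)
```

A fixed seed keeps the sample reproducible, so a reported mismatch can be re-checked on the same rows. Setting `fraction=1.0` turns the same code into a full comparison when time allows.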
Challenge: Data warehouse testing has to deal with large volumes of data from multiple systems. Many data issues may be reported initially, and if they are not resolved quickly they may affect the project schedule.
Solution: It is always better to use sampling to resolve the data issues.
Challenge: Because a data warehouse has a complex architecture, a small change may have an impact in multiple places.
Solution: Regression testing approach should be
Challenge: If data volume is not considered when sizing the infrastructure for the project, there could be delays from long procurement cycles and/or infrastructure-level bottlenecks during the test execution phase, because normal desktop machines may not be able to handle the magnitude of data.
Solution: Gather information about the data volume in the initial stages, so that the cost and timelines for procuring the machines can be taken into account when arriving at a realistic schedule.
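
A rough first estimate of the data volume can be computed from row counts and average row sizes. The sketch below is only a back-of-the-envelope aid; the function name, the table names, and every figure in it are made up for illustration, and the overhead factor for indexes and staging copies is an assumption to be replaced with measured values.

```python
def estimate_storage_gib(row_counts, avg_row_bytes, overhead_factor=1.0):
    """Estimate storage in GiB from per-table row counts and row sizes.

    row_counts and avg_row_bytes are dicts keyed by table name.
    overhead_factor scales the raw size for indexes, staging copies, etc.
    """
    total_bytes = sum(row_counts[t] * avg_row_bytes[t] for t in row_counts)
    return total_bytes * overhead_factor / 2**30

# Illustrative figures only, not taken from a real system.
row_counts = {"fact_sales": 2**20, "dim_customer": 2**10}
avg_row_bytes = {"fact_sales": 1024, "dim_customer": 1024}

raw_gib = estimate_storage_gib(row_counts, avg_row_bytes)
sized_gib = estimate_storage_gib(row_counts, avg_row_bytes, overhead_factor=2.0)
```

Comparing the estimate with the disk and memory of the planned test machines early on shows whether procurement needs to start before test execution.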
To meet these challenges, we can approach ETL testing in a modular way, so that we run into as few challenges as possible as the testing proceeds.
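
A modular approach can be illustrated by testing each ETL stage on its own before testing the whole pipeline. The sketch below is a toy example under stated assumptions: the `extract`, `transform`, and `load` functions, the `id,amount` line format, and the rule that negative amounts are dropped are all hypothetical.

```python
def extract(raw_lines):
    """Parse 'id,amount' lines into dicts, skipping blank lines."""
    rows = []
    for line in raw_lines:
        if line.strip():
            id_text, amount_text = line.split(",")
            rows.append({"id": int(id_text), "amount": amount_text.strip()})
    return rows

def transform(rows):
    """Convert amounts to float and drop rows with negative amounts."""
    out = []
    for row in rows:
        amount = float(row["amount"])
        if amount >= 0:
            out.append({"id": row["id"], "amount": amount})
    return out

def load(rows, warehouse):
    """Upsert rows into the warehouse dict keyed by id; return rows loaded."""
    for row in rows:
        warehouse[row["id"]] = row["amount"]
    return len(rows)

def test_extract():
    assert extract(["1, 10.5", "", "2,-3"]) == [
        {"id": 1, "amount": "10.5"},
        {"id": 2, "amount": "-3"},
    ]

def test_transform():
    rows = [{"id": 1, "amount": "10.5"}, {"id": 2, "amount": "-3"}]
    assert transform(rows) == [{"id": 1, "amount": 10.5}]

def test_load():
    warehouse = {}
    assert load([{"id": 1, "amount": 10.5}], warehouse) == 1
    assert warehouse == {1: 10.5}

def test_end_to_end():
    warehouse = {}
    load(transform(extract(["1, 10.5", "2,-3", "3,7"])), warehouse)
    assert warehouse == {1: 10.5, 3: 7.0}
```

Because each stage has its own test, a failure points to one module instead of the whole pipeline, which keeps defects easier to locate as data volume and complexity grow.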