4 steps to manage your test data


Every tester needs data to develop and test the quality of software and applications.

Test data can be created manually, generated with a data generation tool, or retrieved from an existing production environment.

This data doesn’t just happen; it needs to be managed properly in order to be useful for tests. Test data management can be divided into 4 steps:

Data Knowledge

Insight into your data model is necessary for creating a proper test data set. Many testers have a good understanding of their data, but a tool can also help to discover what is actually stored in the database.

Profile your data to find privacy-sensitive data, visualize data dependencies, and detect data anomalies. These findings help you sharpen your test data requirements.
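As a rough illustration of what profiling can look like, here is a minimal Python sketch that scans one table for missing values and for columns whose values all look like email addresses. The table, the column names and the `profile_table` helper are hypothetical and only serve the example; a real profiling tool checks many more patterns.

```python
import re
import sqlite3

# Very loose email pattern; good enough to flag candidate columns for review.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def profile_table(conn, table):
    """Return per-column null counts and a flag for email-like columns.

    The table name is interpolated directly, so it must come from a trusted source.
    """
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    report = {}
    for i, col in enumerate(columns):
        values = [row[i] for row in rows]
        non_null = [v for v in values if v is not None]
        email_like = sum(
            1 for v in non_null if isinstance(v, str) and EMAIL_RE.match(v)
        )
        report[col] = {
            "nulls": len(values) - len(non_null),
            "email_like": email_like,
            # Flag the column only if every non-null value looks like an email.
            "looks_sensitive": bool(non_null) and email_like == len(non_null),
        }
    return report


def build_demo_db():
    """Create a small in-memory database with made-up data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, contact TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [
            (1, "Ann", "ann@example.com"),
            (2, "Bob", "bob@example.com"),
            (3, None, "cy@example.com"),
        ],
    )
    return conn
```

Calling `profile_table(build_demo_db(), "customers")` returns one entry per column, which can then be reviewed to decide which columns need masking and where data is missing.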

Subset Datasets

As mentioned in the introduction, test data can be created manually, generated synthetically, or retrieved from an existing production environment.

Creating data manually or generating it synthetically is only feasible when you have a few tables. As the number of tables grows, it becomes more and more difficult. That’s why many organizations use a full (100%) copy of production, even though it’s pretty outdated.

Most organizations don’t need all the data they have stored in their non-production environments, and storing it costs them money. Using subsets instead results in test data sets that still contain all the needed test cases without taking up the storage capacity of a full copy.
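To make the idea of subsetting concrete, the following Python sketch selects a slice of a parent table and then only the child rows that refer to it, so the subset stays referentially intact. The `customers` and `orders` tables and the filter on `country` are assumptions made up for this example, not a description of any particular tool.

```python
import sqlite3


def build_source():
    """Create a tiny stand-in for a production database (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(id),
            total REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "NL"), (2, "DE"), (3, "NL"), (4, "US")],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(10, 1, 5.0), (11, 2, 7.5), (12, 3, 1.0), (13, 3, 2.0), (14, 4, 9.0)],
    )
    return conn


def subset(conn, country):
    """Return customers from one country plus only the orders that belong to them."""
    customers = conn.execute(
        "SELECT id, country FROM customers WHERE country = ? ORDER BY id",
        (country,),
    ).fetchall()
    ids = [row[0] for row in customers]
    if not ids:
        return {"customers": [], "orders": []}
    # One placeholder per selected customer id keeps the query parameterized.
    placeholders = ",".join("?" for _ in ids)
    orders = conn.execute(
        "SELECT id, customer_id, total FROM orders "
        f"WHERE customer_id IN ({placeholders}) ORDER BY id",
        ids,
    ).fetchall()
    return {"customers": customers, "orders": orders}
```

Selecting the parent rows first and following the foreign key to the child rows is the key design choice here: every order in the subset still points at a customer that is also in the subset.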

Mask Your Data

Test data that is retrieved from production, subsetted or not, may contain privacy-sensitive information.

To protect personally identifiable information (PII), data needs to be anonymized or masked before it is used for purposes like testing and development.

Data can be masked with the help of masking rules and synthetic data generation.

A good data masking tool combines several techniques to build a proper masking template.
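A masking template can be thought of as a mapping from columns to masking rules. The sketch below shows one possible rule, deterministic pseudonymization with a keyed hash, so the same input always yields the same masked value and relations between tables keep working. The key, the rule names and the column names are hypothetical, and this is only one of several masking techniques.

```python
import hashlib
import hmac

# Hypothetical demo key; in practice it would be kept outside the code base.
SECRET_KEY = b"demo-only-key"


def pseudonym(value, prefix, length=8):
    """Replace a value with a stable token derived from a keyed hash."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{prefix}_{digest[:length]}"


def mask_email(email):
    """Mask an email address but keep it looking like an email address."""
    return pseudonym(email.lower(), "user") + "@example.com"


# The "masking template": which rule applies to which column.
MASKING_RULES = {
    "name": lambda v: pseudonym(v, "name"),
    "email": mask_email,
}


def mask_row(row, rules):
    """Apply the matching rule to each column; leave other columns and NULLs as-is."""
    return {
        col: rules[col](val) if col in rules and val is not None else val
        for col, val in row.items()
    }
```

For example, `mask_row({"id": 1, "name": "Ann", "email": "Ann@Corp.com"}, MASKING_RULES)` keeps `id` untouched and replaces `name` and `email` with tokens that are identical on every run with the same key.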

Automate Test Data

Research shows that a significant share of software development time (including testing) is lost waiting for test data refreshes.

The reason is that requesting a refresh is an unnecessarily complicated and thus time-consuming process, as shown in the picture below.

Why does it take so much time? Because it takes so many people! If Dev, Test and QA could only manage their own test data, a lot of time would be saved.

With the help of a test data management tool, testers can refresh their own data set via a self-service portal. The tool can also be integrated with other tools to automate test data provisioning, and subsetting and masking can be automated as well.
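What an automated refresh boils down to can be sketched in a few lines of Python: one function that empties the test data set, selects the rows to keep, masks them, and loads them, so a tester or a build job can trigger it without involving other teams. The function name, its parameters and the in-memory lists standing in for environments are all assumptions for illustration.

```python
def refresh_test_data(source, keep, mask, target):
    """Rebuild `target` from `source`: subset with `keep`, then mask each row.

    source: iterable of dict rows (stand-in for production data)
    keep:   function returning True for rows that belong in the subset
    mask:   function returning a masked copy of a row
    target: list that represents the test environment; it is refreshed in place
    """
    target.clear()  # drop the stale test data first
    for row in source:
        if keep(row):
            # Pass a copy so the masking rule can never alter the source row.
            target.append(mask(dict(row)))
    return len(target)


def demo_mask(row):
    """Made-up masking rule: replace the email with a generated address."""
    return {**row, "email": f"user{row['id']}@example.com"}
```

A call such as `refresh_test_data(production_rows, lambda r: r["country"] == "NL", demo_mask, test_rows)` could be run on demand by a tester or as a step in a build pipeline, which removes the manual request from the process.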

Test Data Management

It is important that test data is highly available and easy to refresh to improve the time to market of your software.

When test data is easily accessible and testers are able to refresh their own test environments, the entire software development cycle will benefit.

You need to be in control of your test data if you want to start with continuous integration or continuous deployment.

For more information about test data management, visit https://www.datprof.com.