If you’re not familiar with Bloor, it’s an independent research and analyst house founded to help organizations choose optimal technology solutions.
As database teams grapple with shortening release cycles and tightening data protection laws, the need to deliver realistic and compliant test data to development quickly and safely is greater than ever. Choosing the wrong approach can hamper development, drive down quality, risk non-compliance, and lead to escalating infrastructure costs.
In this post, I’ll summarize some of the key findings from the report and shed some light on that all-important score of 4.5 for SQL Provision.
In the guide, Bloor defines three primary methods of test data management: data subsetting, data virtualization, and synthetic data generation:
Data subsetting consists of taking a subset from one or more production databases, usually of a much smaller size than the database(s) as a whole.
Data virtualisation has a similar motivation to data subsetting, at its core: take large production databases and make them easy and efficient to distribute and test with. However, where data subsetting does this by reducing the amount of data being bandied around, data virtualisation does it by allowing you to create virtual copies of your databases.
Synthetic data generation breaks with data subsetting and data virtualisation by opting to disregard your production data for use as test data. Instead, it allows you to create your own ‘synthetic’ test data in an automated fashion.
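To make the subsetting idea concrete, here is a minimal Python sketch of the core challenge Bloor raises: a subset is only useful if it stays referentially intact. The table shapes and column names (`orders`, `customers`, `customer_id`) are hypothetical illustrations, not anything from the report or from SQL Provision.

```python
# A minimal sketch of data subsetting: take a sample of one table and
# pull in every row it references, so the subset stays referentially intact.
# Tables are modelled as lists of dicts; names are hypothetical.

def subset_with_references(orders, customers, sample_size):
    """Take the first `sample_size` orders, plus every customer they reference."""
    order_subset = orders[:sample_size]
    needed_ids = {o["customer_id"] for o in order_subset}
    customer_subset = [c for c in customers if c["id"] in needed_ids]
    return order_subset, customer_subset

customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}, {"id": 3, "name": "Alan"}]
orders = [
    {"id": 100, "customer_id": 1},
    {"id": 101, "customer_id": 3},
    {"id": 102, "customer_id": 1},
]

order_subset, customer_subset = subset_with_references(orders, customers, 2)
# Every order in the subset still points at a customer that exists in the subset.
assert all(o["customer_id"] in {c["id"] for c in customer_subset} for o in order_subset)
```

Real subsetting tools have to do this across many foreign-key chains at once, which is exactly why keeping the result realistic and intact is described as the hard part.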
The analyst finds that the reduced size of the dataset achieved with data subsetting brings advantages in terms of ease and speed of distribution – but also that the method brings challenges around ensuring the data subset is realistic and referentially intact. Data virtualization, by contrast, he finds, provides a small and lightweight dataset that is also fully representative.
When it comes to synthetic data generation, he describes its main advantage as a failsafe way of ensuring no sensitive information is present (because the data is completely fake). But, as with data subsetting, arriving at a realistic dataset can be challenging.
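The "failsafe" property of synthetic data comes from the fact that every value is fabricated rather than copied. A minimal Python sketch of the idea, with hypothetical field names and value pools (nothing here reflects how any particular product generates data):

```python
import random

# A minimal sketch of synthetic data generation: rows are fabricated from
# scratch, so no production value can ever leak into the test set.
# Field names and value pools are hypothetical.

FIRST_NAMES = ["Ada", "Grace", "Alan", "Edsger"]
DOMAINS = ["example.com", "example.org"]

def generate_customers(n, seed=0):
    rng = random.Random(seed)  # seeded, so test runs are repeatable
    rows = []
    for i in range(1, n + 1):
        name = rng.choice(FIRST_NAMES)
        rows.append({
            "id": i,
            "name": name,
            "email": f"{name.lower()}{i}@{rng.choice(DOMAINS)}",
        })
    return rows

rows = generate_customers(100)
```

The catch the analyst points to is visible even here: making generated rows statistically resemble real production data (distributions, correlations, edge cases) takes far more work than generating them at all.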
For data subsetting and data virtualization, sensitive data will need to be obscured as part of the process, and the analyst mentions static data masking as a way to achieve this.
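As a rough illustration of what static data masking does, here is a small Python sketch that overwrites a sensitive column with a deterministic, irreversible substitute before the copy is handed to development. The column name and the hashing scheme are assumptions for the example, not the report's or SQL Provision's actual masking rules.

```python
import hashlib

# A minimal sketch of static data masking: sensitive columns are overwritten
# in the copied dataset, so the original values never leave production.
# The column name and masking scheme here are hypothetical.

def mask_email(email):
    """Replace an email with a stable pseudonym, so the same input always
    masks to the same output and joins on the column still line up."""
    digest = hashlib.sha256(email.encode()).hexdigest()[:10]
    return f"user_{digest}@masked.example"

def mask_rows(rows):
    return [{**row, "email": mask_email(row["email"])} for row in rows]

production = [{"id": 1, "email": "ada@company.com"}]
masked = mask_rows(production)
# The original address never appears in the masked copy.
assert "ada@company.com" not in str(masked)
```

Deterministic masking is one common design choice: it keeps cross-table consistency, at the cost of needing care (salting, access control) so the mapping cannot be reversed by rehashing known inputs.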
In terms of the market trends driving test data management approaches, the analyst highlights the heightened importance of protecting sensitive data, thanks to GDPR and upcoming regulations, and notes that this is driving both data masking and synthetic data generation capabilities.
An important trend for database development teams is the increased emphasis on test data provisioning, as opposed to merely test data management, driven by DevOps and Agile development practices:
The idea is to provide not only a way to create test data, but a method of distributing it effectively and efficiently to your testers, often by means of self-service. The advantage here is a significant improvement to the tester experience and to testing efficiency, thus (one hopes) preventing test data as a whole from becoming a bottleneck to your continuous testing, test automation, or DevOps pipelines … Test data provisioning particularly benefits from a data virtualisation capability.
Redgate is listed as an innovator in the report, and SQL Provision achieves a score of 4.5 stars out of 5 for test data provisioning. The accompanying InBrief explains the reasoning behind this:
Redgate’s entire approach hinges on two concerns: compliance (with existing mandates, such as GDPR, as well as forthcoming regulations) and DevOps. Effective test data management is essential for achieving both of these. Desensitising your test data is necessary for compliance with a variety of mandates, as well as ensuring data privacy and security (protection from data breaches, for instance), while timely provisioning of test data – delivering the right test data to the right place at the right time – is an important component of any DevOps pipeline. Consequently, SQL Provision provides both of these capabilities.
Moreover, SQL Provision does so using a combination of database cloning and data masking. This has some clear advantages over competing approaches, such as data subsetting or synthetic data generation, most of all that it guarantees that your test data will be representative. It also makes provisioning that test data fast and easy.
What’s more, Redgate is uniquely positioned in offering a solution based on database cloning to the mid-market, whereas competing products tend to be targeted at the high-end.
Organizations such as PASS and KEPRO are already taking advantage of the benefits a combined database cloning and data masking approach brings. By implementing SQL Provision, they’ve been able to solve the major challenges involved with provisioning realistic and compliant test data to development quickly and safely.
SQL Provision also opens new opportunities for database teams, such as enabling the transition from shared to dedicated development environments, shift-left testing, and refreshing data on-demand through self-service and automated processes.
It’s good to see that the advantages it brings to database development have now been recognized by an independent analyst.
To learn more, download your free copy of the Bloor 2019 Market Update for Test Data Management.