Your Test Data Environment: Build vs Buy – a conversation we need to have
After three decades of working with databases, one thing I've seen over and over is this: we don't treat our development and test environments with the same respect we give our production systems.
Not because people don’t care. Far from it. It’s usually because teams are under pressure, everyone’s juggling multiple priorities, and the quickest path forward often wins the day.
Those lower environments usually contain copies of real customer data, though, and that makes them real targets for malicious actors.
Developers need realistic data to build reliable software. Testers need it to verify that logic is correct and performance is acceptable. And the business needs to stay compliant and avoid risk. That combination creates tension, and most organizations resolve it with what feels like the most straightforward fix.
The typical workarounds are anonymizing production data or generating data for test and development environments. Many organizations simply task someone on the team with building this process. It typically isn't that person's primary job, so a quick DIY script gets chosen as the quickest, easiest option, without much consideration of the risk or governance issues involved.
Why DIY scripts seem easy… until they aren’t
When someone gets asked to “sort out the masking,” they’re usually trying to make progress quickly so they can get back to their real job. That leads to decisions like:
- Shuffling data around, masking parts of strings, setting every row to the same value (see the sketch after this list)
- Reusing old logic in their masking scripts
- Copying and pasting scripts and forgetting to update all of them
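To make this concrete, here is a minimal sketch of the kind of quick-fix script that results. It is illustrative Python over an invented customers table (the same pattern turns up just as often in SQL or PowerShell), and every table and column name here is hypothetical.

```python
# A minimal sketch of a typical quick-fix masking pass over a hypothetical
# 'customers' table copied down from production. Names and columns are invented.
import random

customers = [
    {"id": 1, "name": "Ada Lovelace", "email": "ada@example.com", "phone": "07700900123"},
    {"id": 2, "name": "Alan Turing",  "email": "alan@example.com", "phone": "07700900456"},
]

# Shortcut 1: set every email to the same placeholder value.
for row in customers:
    row["email"] = "masked@example.com"

# Shortcut 2: shuffle names between rows so they "look" different.
names = [row["name"] for row in customers]
random.shuffle(names)
for row, name in zip(customers, names):
    row["name"] = name

# Shortcut 3: blank out part of a string and hope that is enough.
for row in customers:
    row["phone"] = row["phone"][:4] + "*" * (len(row["phone"]) - 4)

# Anything added to production after this script was written, say a
# date_of_birth or national_id column, passes through completely unmasked.
```

It works today, for this schema. The trouble starts when the schema stops standing still.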
The challenge is that databases keep evolving – new columns appear, new data types get introduced, the business adds new use cases – and unless masking scripts evolve at the same pace, they drift out of date quickly and quietly.
And then maybe the person who wrote them moves to another team. Or leaves the organization. Suddenly no one knows why something was masked a certain way or what the original intention was, and that’s when things get risky.
The outcome?
You end up with scripts that don't account for new schema changes, or worse, scripts that break, never get prioritized for fixing, and quietly stop being run. At that point you're effectively restoring production data into development with no protection at all, which dramatically increases the risk of regulatory penalties or, worse, reputational damage from a data breach.
Synthetic data has its own limits
Some teams avoid masking altogether and swing to the other extreme: synthetic data.
There’s nothing wrong with synthetic data – when it’s done well. But most of the time, teams generate something small, simple, and not very realistic.
Developers then test with data that doesn’t reflect real-world patterns. Edge cases disappear. Performance bottlenecks stay hidden.
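To illustrate, here is a sketch of what "small, simple, and not very realistic" often looks like in practice. The generator below is hypothetical: it produces a few hundred tidy, uniform rows that never exercise long names, accented characters, nulls, skewed volumes, or any of the other messiness real data carries.

```python
# A sketch of the kind of "good enough" synthetic data a busy team often
# produces. The shape, ranges, and volume here are invented for illustration.
import random
import string

def random_string(length=8):
    return "".join(random.choices(string.ascii_lowercase, k=length))

# A few hundred uniform rows: no nulls, no duplicates, no accented names,
# no customer with ten thousand orders, no skew in amounts or dates.
orders = [
    {
        "customer": random_string(),
        "email": f"{random_string()}@test.com",
        "amount": round(random.uniform(1, 100), 2),
        "items": random.randint(1, 3),
    }
    for _ in range(500)
]
```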
The problem isn't the intent. It's that generating high-quality synthetic data is a craft, and most DBAs and developers don't have the time to master it on top of everything else they're responsible for. Both DIY approaches solve the problem for today, but rarely for tomorrow.
What a good, sustainable approach to TDM actually looks like
If we want to protect data and help teams move faster, the solution must be easy to use, flexible enough to handle a variety of situations, and simple enough for our staff to understand and, more importantly, maintain across time and personnel changes.
The right approach to test data should be:
1. Simple to pick up and simple to maintain
Teams change and documentation gets lost, which is why this matters: comments in code are not enough. You need documentation, or better still automated processes, that enable others to extend the solution and apply it to new situations.
2. Smart enough to find sensitive data for you
We store sensitive data in more places than ever, and no one can keep track of it manually anymore. Automatic detection, with the flexibility to override, is essential (see the sketch after this list).
3. Capable of producing realistic, familiar data
Developers do their best work with data that “feels” real. Random strings and nonsense values slow everyone down.
4. Able to subset production safely
You don’t always need the entire production database. Often, a well-formed slice is enough – and it’s much faster to work with.
5. Built for modern databases
A solution also needs to work at scale and handle the complexities of modern databases: varied data types, large volumes of data, referential integrity (declared or not), and datasets in many different languages.
These are the main considerations for any solution that protects test data.
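To ground point 2 above, here is a toy sketch of what DIY discovery usually amounts to: pattern-matching on column names. The schema and patterns below are invented for illustration; a proper detection capability also samples the data itself and lets you confirm or override each classification, because sensitive values routinely hide in free-text fields and badly named columns.

```python
# A toy illustration of DIY sensitive-data discovery: scan column names for
# suspicious patterns. The in-memory schema below is invented for illustration.
import re
import sqlite3

SUSPICIOUS = re.compile(r"email|name|phone|address|dob|birth|ssn|national_id|card",
                        re.IGNORECASE)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, full_name TEXT, email TEXT, signup_date TEXT);
    CREATE TABLE payments  (id INTEGER, customer_id INTEGER, card_number TEXT, amount REAL);
""")

for (table,) in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'"):
    for _, column, *_ in conn.execute(f"PRAGMA table_info({table})"):
        if SUSPICIOUS.search(column):
            print(f"Possible PII: {table}.{column}")
```

Column-name matching is a reasonable start, but it is exactly the kind of logic that quietly falls behind as the estate grows.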
The Build vs. Buy decision
I'll be the first to admit: you can build your own masking solution. Plenty of teams do. But much like a home-grown monitoring system, it demands a substantial ongoing investment of time and labor. Building the scripts is the easy part; maintaining them is the real challenge.
Our databases are always changing, and we often add new database platforms. Ensuring a masking solution works well across time and the entire database estate takes a great deal of effort.
That's why a lot of organizations are moving toward paid-for, vendor-supported solutions. Not because they couldn't build their own, but because:
- It reduces overhead
- It preserves institutional knowledge
- It scales with the environment
- It gives new joiners a fighting chance
- It ensures compliant test data and reduces risk
We often build software because we can’t buy something that works in the way our business works. However, as we’ve just walked through, for some tasks – like masking test data in a compliant way – a paid solution is better for several reasons, and Redgate has just the thing.
Where Redgate Test Data Manager fits in
Redgate Test Data Manager was built for exactly these challenges. It helps teams:
- Discover sensitive data automatically
- Mask it with realistic values
- Retain referential integrity
- Subset intelligently
- Reduce PII exposure
- Scale with changes in the database estate
- Ensure compliance is built in, not a bottleneck
If you want to see how it fits into your workflow, download the free trial and see it in action. It’s quick to set up, easy to learn, and takes the burden of DIY off your team’s plate.
FREE TRIAL
Try Redgate Test Data Manager
Get compliant test data in minutes, not months.