20 April 2026

How to find the PII hiding in your database

Column names lie about what's inside. Redgate's AI Classification scans your actual data to catch PII that metadata rules miss, all locally.

Guest post

This is a guest post from James Hemson.

You’ve masked your database. You’re confident you’ve covered all your PII. But not all of it lives where you’d expect it to.

AI Classification, Redgate's data scanning capability, automatically catches sensitive data before it reaches your developers, your test environments, or your auditors.

The column name problem

Name-based classification tools work when columns are named for what’s in them, such as FirstName and DateOfBirth. They don’t work when a column called ‘col_7’ actually contains dates of birth, or when someone has been pasting customer home addresses into a ‘Notes’ column for years.
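To make the failure concrete, here is a minimal sketch of a name-based rule set. It is not how Redgate Test Data Manager is implemented; the rule names, patterns, and the `classify_by_name` function are all hypothetical and exist only to illustrate the idea.

```python
import re

# Hypothetical name-based rules (an assumption for illustration only).
NAME_RULES = {
    "first_name": re.compile(r"first[_ ]?name", re.IGNORECASE),
    "date_of_birth": re.compile(r"date[_ ]?of[_ ]?birth|dob", re.IGNORECASE),
    "address": re.compile(r"address", re.IGNORECASE),
}


def classify_by_name(column_name):
    """Return the label of the first rule matching the column name, else None."""
    for label, pattern in NAME_RULES.items():
        if pattern.search(column_name):
            return label
    return None


columns = ["FirstName", "DateOfBirth", "col_7", "Notes"]
results = {name: classify_by_name(name) for name in columns}
print(results)
```

FirstName and DateOfBirth are labelled, while col_7 and Notes come back as None. The rules never look at the values, so nothing in the names tells them what those two columns hold.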

This isn't carelessness; it's the normal state of any database that has been developed over time. Schemas are built by different people, often without documentation. Columns get repurposed. Fields get renamed. Systems get inherited. Over time, the gap between what the columns are called and what they contain grows.

That gap is what GDPR, SOX, and HIPAA auditors uncover. It’s the difference between what you thought you masked and what you actually masked.

What AI classification does differently

Redgate Test Data Manager already classifies columns using metadata like names and data types. AI classification adds an extra layer of data discovery and classification on top. It samples your actual data and uses machine learning to identify what each column contains. A column full of values that look like home addresses gets flagged as addresses, whether it's called ‘Address’ or ‘misc_field_3’.

Column name rules cover the tidy parts of your schema. Data scanning with AI classification covers the rest.
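The sample-and-inspect idea can be sketched in a few lines. This is an illustration only: it swaps the machine learning models for two crude hand-written detectors, and the detector functions, the 0.6 threshold, and the sample values are all assumptions rather than anything taken from the product.

```python
import re
from datetime import datetime


def looks_like_date(value):
    """True if the value parses as an ISO date (YYYY-MM-DD)."""
    try:
        datetime.strptime(value.strip(), "%Y-%m-%d")
        return True
    except ValueError:
        return False


# Crude stand-in for an address model: house number, one to three words, street suffix.
ADDRESS_PATTERN = re.compile(
    r"\b\d{1,5}\s+(?:\w+\s+){1,3}(?:Street|St|Road|Rd|Avenue|Ave|Lane|Ln)\b",
    re.IGNORECASE,
)


def looks_like_address(value):
    return ADDRESS_PATTERN.search(value) is not None


DETECTORS = {"date": looks_like_date, "address": looks_like_address}


def classify_by_content(sample, threshold=0.6):
    """Label a column by the detector matching the largest share of sampled values."""
    values = [v for v in sample if v and v.strip()]
    if not values:
        return None
    best_label, best_share = None, 0.0
    for label, detector in DETECTORS.items():
        share = sum(1 for v in values if detector(v)) / len(values)
        if share > best_share:
            best_label, best_share = label, share
    return best_label if best_share >= threshold else None


# Hypothetical sampled values; the column names carry no useful meaning.
samples = {
    "col_7": ["1984-03-12", "1990-11-02", "1975-07-30", "2001-01-15"],
    "misc_field_3": ["12 High Street, Leeds", "48 Mill Road, Cambridge",
                     "called back Tuesday", "7 Oak Lane, York"],
    "status": ["active", "closed", "active", "pending"],
}
labels = {name: classify_by_content(values) for name, values in samples.items()}
print(labels)
```

Here col_7 is labelled as dates and misc_field_3 as addresses, even though one of its four sampled values is an unrelated note. The threshold is what lets a mostly-address column get flagged despite some noise. Note that content alone says the values are dates, not that they are dates of birth; making that distinction is where a trained model earns its keep over patterns like these.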

AI classification vs column-name classification

Capability	Column-name classification	AI classification (data scanning)
Identifies FirstName, DateOfBirth	Yes	Yes
Identifies col_7 containing dates of birth	No	Yes
Identifies Notes containing pasted addresses	No	Yes
Requires correct column naming	Yes	No
Runs locally, no cloud dependency	Yes	Yes

AI that runs in your environment – Your data stays yours

AI classification runs the ML models inside your environment. Your data stays on your network throughout, with no external API calls and nothing in transit. For teams in financial services, healthcare, or government, local execution is usually a procurement requirement: a tool that touches data regulated under GDPR, PCI-DSS, or HIPAA often cannot be approved without it.

From discovery to protection

When Redgate Test Data Manager identifies sensitive data, it recommends a masking rule that preserves referential integrity across your schema. You review, adjust where needed, and start masking. No spreadsheet exports or manual column mapping.

Find and protect the sensitive data your column names miss. Try Redgate Test Data Manager.

Frequently asked questions

How do I automatically detect sensitive data in my database?

AI classification in Redgate Test Data Manager scans your actual data to find sensitive columns based on content, not column names. It catches what metadata rules miss.

Can I detect PII without sending data to the cloud?

Yes. AI classification runs ML models locally inside your environment. No data leaves your network.

What happens if PII reaches a test environment?

A data breach in a test environment carries the same regulatory exposure as one in production. GDPR fines apply regardless of whether the environment was intended for testing. AI classification finds the data your column-name rules miss before that data reaches developers.

How do I mask my production database for testing?

Redgate Test Data Manager classifies your schema using column metadata, then AI classification scans your actual data to catch columns that metadata rules miss. Once classification is complete, Test Data Manager applies compliant data masking rules that preserve referential integrity across your tables.

Does AI classification replace existing classification rules?

No. It works on top of metadata classification. Your existing rules still apply. Data scanning with AI classification adds a second layer that catches what metadata rules miss.

What is referential integrity in data masking?

Referential integrity means masked values stay consistent across related tables. If a name appears in five tables, Test Data Manager masks it to the same replacement value everywhere, so joins and foreign keys still work.
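One common way to get that consistency is deterministic masking, where the replacement is derived from the original value with a keyed hash. The sketch below shows the general technique and is not a description of how Test Data Manager does it; the key, the replacement list, the `mask_name` function, and the sample tables are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical key (an assumption); a real one would come from a secrets store.
MASKING_KEY = b"example-key-not-for-production"

# Hypothetical pool of replacement values.
REPLACEMENT_NAMES = ["Alex Carter", "Sam Patel", "Jo Nguyen", "Chris Meyer", "Robin Okafor"]


def mask_name(original):
    """Map a name to a replacement deterministically: same input, same output."""
    normalised = original.strip().lower().encode("utf-8")
    digest = hmac.new(MASKING_KEY, normalised, hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(REPLACEMENT_NAMES)
    return REPLACEMENT_NAMES[index]


# Two related tables that hold the same names.
customers = [{"id": 1, "name": "Maria Lopez"}, {"id": 2, "name": "David Chen"}]
orders = [
    {"order_id": 10, "customer_name": "Maria Lopez"},
    {"order_id": 11, "customer_name": "David Chen"},
    {"order_id": 12, "customer_name": "Maria Lopez"},
]

masked_customers = [{**row, "name": mask_name(row["name"])} for row in customers]
masked_orders = [{**row, "customer_name": mask_name(row["customer_name"])} for row in orders]
```

Because each table is masked with the same function and key, every occurrence of a name gets the same replacement, and a join on the masked name still lines up. With a pool this small, two different originals can land on the same replacement, so a realistic setup would need a much larger pool or a collision-free mapping.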

Free trial

Try Redgate Test Data Manager

Get compliant test data in minutes, not months.

Start your free trial

Tools in this post

Redgate Test Data Manager

Reliable and secure test data provisioning

Find out more
