But the Database Worked in Development! Preventing Broken Constraints


You’re working in development, releasing a new version of the application. You’ve temporarily disabled constraints in the new version of the database, you’ve imported the current data, your hand is poised to enable constraints. Is it going to spring errors at you? What if it does?

Or, what if you synchronize a source with a target by changing the metadata, only to find that the process has an error saying a constraint, index or foreign key can’t be created?

How could this happen? In both cases, the data does not comply with the constraints. Maybe you’ve added or altered some constraints and tested them on a different version of the data, changing this data slightly to satisfy the new constraints but without providing your changes as a data migration script. Perhaps you’ve only tested the new or altered constraints using an artificial data set, with no troublesome duplicates or broken references. Possibly, staging is managed by a different team: deployment and release are often, from necessity, done by a different team from development. You release the new database version to staging and, when they try to update the existing database to the new build, the build breaks.

Often, at this point, one or other of the teams will need to repeatedly try releasing while fixing all the bad data, one error message at a time. This isn’t going to help team bonding; you need a better way of dealing with these problems.

In this series of articles, I’ll provide a more reliable way to tease out all the duplicates, the broken foreign key references and the values that will cause CHECK constraint errors, before you run a deployment. It will report not only where the errors would occur, but which data in which tables would cause which constraints to fail.

Dealing with Data that Breaks the Build

Sometimes a database build breaks only when you try to release it to staging or, worse, to production. Up to that point everything is sweetness and light. Then you either synchronise or try to load the data. What has happened? Well, it could be that in the new version of the database, you’ve added CHECK constraints, FOREIGN KEYs or unique indexes, as you should, or modified existing keys and constraints to enforce referential integrity and do a better job of preventing ‘bad data’ from creeping into the database.

If you are very fortunate, you will have access to the real data. You can then enable constraints one at a time and, when one causes an error, fix the data reported in the error. From these fixes you develop a pre-deployment script that heals the data: it removes the duplicates and bad data, and fixes the broken FOREIGN KEY references that initially inspired you to do the work.

There’s nothing wrong with that sort of data migration script, of course, although it requires a lot of careful work, gathering up all the de-duplicating statements, making them idempotent, testing them, and then using them in a pre-deployment script for a release. Even after all this, new data may have been inserted into Production, since you took the backup for testing, which might still break the build. If you are importing the existing data into a new build, then you would disable all constraints, import the data, heal it in-situ with the data-migration script and then finally enable constraints.
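The disable, import, heal, enable sequence described above can be sketched for a single table as follows. This is only an outline under stated assumptions: dbo.Customer is a hypothetical table name standing in for whatever you are loading, and the import and healing steps are left as comments.

```sql
-- Hypothetical table: dbo.Customer stands in for any table being loaded.
-- 1. Stop CHECK and FOREIGN KEY constraints being enforced on the table.
ALTER TABLE dbo.Customer NOCHECK CONSTRAINT ALL;

-- 2. Import the existing data here, then run the data-migration script
--    that heals it in-situ.

-- 3. Re-enable the constraints. WITH CHECK makes SQL Server validate the
--    existing rows, so this fails if any row still violates a constraint.
ALTER TABLE dbo.Customer WITH CHECK CHECK CONSTRAINT ALL;
```

NOCHECK CONSTRAINT ALL affects only CHECK and FOREIGN KEY constraints. PRIMARY KEY and UNIQUE constraints, and unique indexes, are not disabled this way, so duplicates have to be dealt with separately.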

What, though, if you can’t access the production data, for any of several good reasons? In this case, you’re most likely testing with an artificially generated data set or a masked copy of the production data. Unless you are one of the wild men of IT, it isn’t the actual production data, and so doesn’t have all the real data’s failings. Now, either the Ops team will need to fix all the duplicates and bad data until the build succeeds, or they will pass the baton back to you.

Unless you enjoy buying all the drinks on the Friday afternoon DevOps team meeting, you must never deliberately break the database build or release. What you need to do is to provide a way that the Ops people can check the data of the target database beforehand to make sure it will pass all the checks done by constraints and unique indexes, and that all the FOREIGN KEY references are in place. If it doesn’t, and this is the key difference, they will need a report of what data failed the new constraints, so that you can provide scripts to allow DevOps people to fix the data.

When you know what is failing, you can create the deduplication, broken FOREIGN KEY reference or data-sanitizing scripts. Then you re-test until you get a clean bill of health. Although I can’t really help with the actual de-duplication, I can, in this article, help you to generate the list of the constraints that will fail and why.

Duplicates

Duplicates are like rats: they get in unless you take active steps to stop them. Data just seems to want to reproduce itself. The grey-muzzled database developer will take elaborate steps to check for duplicates everywhere by using unique constraints, whilst the cub developers snigger amongst themselves at how impossible it would be for a duplicate to get in there anyway. The blighters always insinuate themselves wherever there are no checks against them. Duplicates, I mean, rather than cub developers. This means that every database revision seems to have more uniqueness checks, inspired by the labour of teasing duplicates out and stamping on them. If the production data has duplicates, this must be fixed before you can release new unique constraints successfully to that target database.
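As a minimal sketch of what hunting for duplicates looks like before you add a unique constraint, you can group on the proposed key and list any value that occurs more than once. The table dbo.Customer and the column EmailAddress are hypothetical names used only for illustration.

```sql
-- Hypothetical: dbo.Customer.EmailAddress is about to get a unique constraint.
-- List every value that would violate it, with the number of times it occurs.
SELECT EmailAddress, COUNT(*) AS Occurrences
FROM dbo.Customer
GROUP BY EmailAddress
HAVING COUNT(*) > 1
ORDER BY Occurrences DESC;
```

GROUP BY puts all NULLs in one group, which matches the way a SQL Server unique constraint allows only a single NULL in the key.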

Constraints and Bad Data

However often I go on about CHECK constraints, there will always be a developer who will leave them out or mutter in a dignified manner about how all checks need to be done only at the application level. This attitude soon gets divine retribution. Bad data springs up like a rotting fungus over your database unless you add CHECK constraints to all your tables. This is fine but then how do you prevent the excellent and estimable habit of adding them to then interfere with a release? The constraints will stop the build if they meet bad data: it is what they are trained to do. If you don’t like that, then you must fix the bad data first.

Broken Foreign Key References

These are less frequent, but I’ve seen them. What happens here is that you import your data, enable constraints, and you get a message about a foreign key reference. This happens if, for example, the name of a country that is referenced in an address list or currency table is missing. Getting data right in a referenced table can be like picking up lots of tadpoles and putting them in a jar. Unless you know what failed, this can be very tricky to fix.
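To find out what failed, you can look for the rows in the referencing table whose key value has no match in the referenced table. The following is a sketch under assumed names: dbo.Address, dbo.Country and their columns are hypothetical and stand in for your own tables.

```sql
-- Hypothetical: dbo.Address.CountryCode should reference dbo.Country.CountryCode.
-- List the address rows whose country has no matching row in dbo.Country.
SELECT A.AddressID, A.CountryCode
FROM dbo.Address AS A
WHERE A.CountryCode IS NOT NULL -- a NULL key value is not checked by a foreign key
  AND NOT EXISTS
      (SELECT 1 FROM dbo.Country AS C WHERE C.CountryCode = A.CountryCode);
```

The result gives you the exact rows to fix, either by adding the missing country or by correcting the value in the address.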

Checking the CHECK Constraints: First Principles

We’ll show how to do a check. The whole point of this type of test is that it must report the breakages in enough detail that you can smilingly pass to the Ops Guy a script that will heal it.

Let’s do the very simplest check: a check of CHECK constraints. We’ll do this by assembling a batch as a string. This batch will test the condition of every CHECK constraint in the database against the table to which it belongs, and tot up the grand total of rows that failed a constraint. (All code samples in this series can be found on GitHub.)

DECLARE @AllTheFailures INT; --tally of all the failures in the constraints
DECLARE @CheckAllYourConstraints NVARCHAR(MAX) =
  N'Select @RowsFailed =0;'; --the batch starts by zeroing the tally
SELECT @CheckAllYourConstraints =
  @CheckAllYourConstraints --accumulate each query
  + N'
Select @RowsFailed=@RowsFailed+count(*) from ' --the table spec
  + QuoteName(Object_Schema_Name(CC.parent_object_id)) + N'.'
  + QuoteName(Object_Name(CC.parent_object_id)) + N' WHERE NOT '
  + definition --the condition of the CHECK constraint
  FROM sys.check_constraints AS CC
  WHERE is_ms_shipped = 0;
--Now we have a list of select queries that will accumulate
--the total number of rows that fail the condition
EXECUTE sp_executesql @CheckAllYourConstraints,
  N'@RowsFailed int output',
  @RowsFailed = @AllTheFailures OUTPUT;
SELECT @AllTheFailures;

We run this in our test copy of AdventureWorks2016. It returns zero: all the constraints are enabled and, provided they are in a ‘trusted’ state, the rows have already been policed by them.

The batch that was executed was this:

Select @RowsFailed =0;
Select @RowsFailed=@RowsFailed+count(*) from [Person].[Person]
  WHERE NOT ([EmailPromotion]>=(0) AND [EmailPromotion]<=(2))
Select @RowsFailed=@RowsFailed+count(*) from [Sales].[SalesTaxRate]
  WHERE NOT ([TaxType]>=(1) AND [TaxType]<=(3))
Select @RowsFailed=@RowsFailed+count(*) from [Sales].[SalesTerritory]
  WHERE NOT ([SalesYTD]>=(0.00))
Select @RowsFailed=@RowsFailed+count(*) from [Sales].[SalesTerritory]
  WHERE NOT ([SalesLastYear]>=(0.00))
Select @RowsFailed=@RowsFailed+count(*) from [Production].[Product]
  WHERE NOT ([SafetyStockLevel]>(0))
Select @RowsFailed=@RowsFailed+count(*) from [Sales].[SalesTerritory]
  WHERE NOT ([CostYTD]>=(0.00))

…and so on.
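A grand total tells you that something is wrong but not where. One possible variation, sketched here and not necessarily the way the rest of this series does it, is to generate one INSERT per constraint so that the failures are reported by table and constraint. The temporary table #ConstraintFailures and its column names are my own assumptions for the illustration.

```sql
-- Assumed helper table: collects one row per CHECK constraint tested.
CREATE TABLE #ConstraintFailures
  (TheTable NVARCHAR(520), TheConstraint sysname, RowsFailed INT);

DECLARE @CheckEachConstraint NVARCHAR(MAX) = N'';
SELECT @CheckEachConstraint = @CheckEachConstraint + N'
INSERT INTO #ConstraintFailures (TheTable, TheConstraint, RowsFailed)
  SELECT N''' --the table name as a string literal, with any quotes doubled
  + Replace(QuoteName(Object_Schema_Name(CC.parent_object_id)) + N'.'
      + QuoteName(Object_Name(CC.parent_object_id)), N'''', N'''''')
  + N''', N''' + Replace(CC.name, N'''', N'''''') --the constraint name
  + N''', Count(*) FROM ' --the table spec
  + QuoteName(Object_Schema_Name(CC.parent_object_id)) + N'.'
  + QuoteName(Object_Name(CC.parent_object_id))
  + N' WHERE NOT ' + CC.definition + N';'
  FROM sys.check_constraints AS CC
  WHERE CC.is_ms_shipped = 0;

-- The temporary table created above is visible inside the dynamic batch.
EXECUTE sp_executesql @CheckEachConstraint;

-- Report only the constraints that would fail, worst first.
SELECT TheTable, TheConstraint, RowsFailed
  FROM #ConstraintFailures
  WHERE RowsFailed > 0
  ORDER BY RowsFailed DESC;

DROP TABLE #ConstraintFailures;
```

As with the tally, a row where the condition evaluates to unknown because of a NULL is not counted, which matches the way a CHECK constraint treats it.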

Now we’ll mangle our copy of AdventureWorks2016. Don’t worry, since it is a clone maintained by SQL Clone, I can do what I like and refresh it when I want it back to its pristine state. We’ll disable a constraint and alter the data so that when I try to reenable it, it will fail. Dave the Dev of AdventureWorks has decided that the MaritalStatus code in the employee table needs more options than M for married or S for single. What about P for Partner? He disables the CHECK constraint, CK_Employee_MaritalStatus.

ALTER TABLE HumanResources.Employee NOCHECK CONSTRAINT CK_Employee_MaritalStatus;
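If you want to confirm the state of the constraint at this point, the catalog view used earlier records it. This is just a quick sanity check, not part of Dave's change:

```sql
-- After the NOCHECK statement, is_disabled should show 1 for this constraint.
-- is_not_trusted tells you whether SQL Server has verified the existing rows.
SELECT name, is_disabled, is_not_trusted, definition
  FROM sys.check_constraints
  WHERE name = N'CK_Employee_MaritalStatus'
    AND parent_object_id = Object_Id(N'HumanResources.Employee');
```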

Then he makes the necessary changes.