Database Deployment: The Bits – Database Version Drift

When you are about to deploy a new version of a database by updating the current version, one of the essential pre-deployment checks is to make sure that the version in production is exactly what it should be. If changes have somehow slipped in, you'll need to understand them and deal with them before you can deploy.

Redgate DLM Automation, which features in this article, is now part of SQL Change Automation. Find out more

DevOps, Continuous Delivery & Database Lifecycle Management
Automated Deployment

Version-drift (or Version-creep) is the ironic name given to the habit of ‘patching’ the production database to make any type of change, without deploying it properly. The change could be as innocent as adding or changing an index to aid performance, but you can’t take that for granted. If you find changes in a production system that weren’t there when the version was deployed, they need to be investigated.

At the simplest level, when we detect version-drift, we are discovering that the version in production is not the version we think it is.

If we don’t check for version-drift, then it is likely that beneficial changes can be lost, or undesirable changes can be retained unintentionally: The exact consequences depend on the methodology we’re using for a deployment. If there has been drift, we need to find out why and take action to ensure that nothing gets lost, be it changes to indexes, configuration settings, or forensic evidence of fraud.

When deploying a database as a data-tier application, we might get a warning message about version-drift when we attempt the deployment. The safest way to check for version-drift is to compare the metadata before each deployment. Are the tables, routines, settings, static data, indexes, and so on, in production the same as in the archived version, usually represented by the scripts in source-control? I’ll show you how to do this.

Why is drift a bad thing?

I’ve made plenty of changes to a live system in the past: They’re called deployments; it is what they’re for, and there is a correct way of preparing for them and  doing them.  It should be easy to do the right thing. Version-drift shouldn’t happen, but it does. If the deployment process wasn’t so irksome, then the temptation to circumvent it, and ‘patch’ in a change wouldn’t be so strong. If you are finding a lot of version-drift then there is something wrong with the deployment system, either its design or implementation. If drift has happened, then we must delay deployment whilst we put it right.

If we are creating the deployment script by running a Synchronization tool, and press on without correcting the drift, we will retain stuff that maybe shouldn’t be there, whereas if we do it from source control, we will lose stuff that perhaps ought to be there. In either case, there is a chance that something will break. The only way to correct drift is to look at the changes, and ‘sort out the sheep from the goats’; in other words, keep the changes that should have been deployed properly, and discard the alterations that were caused by humanity’s imperfect state.

There is no sense in having an automated process to preserve unexpected changes since not all changes prove to be desirable and we need to inspect all of them before deciding how to deal with them. After all, while the unexpected change could be a genuine and effective change to improve performance, it could equally be part of a fraud, or an attempt to compromise the system’s security. Even if unintentional, the mindless thrashings of a DBA in a muck-sweat can result in a change that leaves a security vulnerability, or a performance time-bomb, in the database. We must treat all unexpected changes as suspect until we prove them otherwise.

When there is an auditing system in place for a production system, then it is unlikely that you’ll hit version drift, because you’ll know of any change. Many shops will have an automated alarm that detects an unscheduled metadata change in a production system which then rings the big red bell on the wall, similar to the ones like you see in the Bond films. (Okay, it sends an email alert). In any event, it is part of a good database intrusion-detection system. We can and should, of course, trap and record any DDL changes as they happen and keep a log of them, as part of an auditing system. Depending on the method of doing this, it can come with a cost in extra resources, though In the case of Extended Events this is isn’t significant.

How does drift happen?

When circumventing the deployment process in order to make a change to a live database system, we make a judgment that the risks of not making the change are greater than the risks of doing so, and hit the button. It could be that the production system isn’t important enough to matter. It could be that the proper deployment process is horribly broken, or that the person in charge of deployment is a power-crazed cabbage. One of the downsides of teamwork is that we learn to tolerate the failings of others, so we manage the change retrospectively. We can’t really assume that we can simply blitz all such database changes. We just have to accept that legitimate version-drift happens. It is frustrating at times, but one can feel a twinge of sympathy with anyone who prefers to go home in good time rather than rolling up their sleeves, updating the same version in Source Control, installing it on the integration server, running the automated integration tests, and then doing the deployment.

Version-Drift and Database Deployment

We need to check for ‘uncontrolled’ version drift before any deployment. If the metadata check reveals that the scripts in Source Control reflect what is in production and they are at the same version, then we breathe a sigh of relief. If there has been version-drift, without the changes going into Source Control, then our day starts to get more complicated.

Some differences, such as replication metadata, are added to the production server automatically on deployment and should be ignored. We can also ignore bogus differences arising from machine-generated names in the SQL Server metadata for ‘anonymous’ keys.

Other modifications, however, we can’t ignore. We need to review all the changes. If we’re in luck, the developer will have documented the change in the live database, via extended properties or headers. If we need to keep the changes then we need to update source control and rerun the pre-deployment automated tests before attempting any deployment.

If we run a deployment script, and the current version is different to the one on which we’ve rehearsed the deployment, then a number of things can go wrong. The ‘ad-hoc’ changes could be lost, the script could fail and roll-back, or the database could be caught in an intermediate state and crash. The migration script could crash.

Checking for Version-Drift

So, what’s required?

  • The Catch-points to prevent a faulty deployment.
    Any database deployment process needs to  stop with an error if it detects that the metadata the database it is upgrading  is different to what is in source control for that same version.  Both Microsoft’s DAC and  Redgate’s DLM Automation will do this automatically. In the case of DAC, a DAC-registered database has stored with it on the same server an XML version of the metadata for that version, along with its version number. The process compares this with what is actually installed. If it is not the same  then it blows the whistle on the deployment. DLM Automation can be set to either check directly with the version in Source Control, or a ‘snapshot’ of the correct version.
  • The means of working out how to remedy the problem
    We need to be able to check for changes.  If a change has happened, we want to know what has changed, why, and who did it.. To work out the whole sequence of changes and unpick them, we need to see the whole history of changes.  There are several ways we can do this. If we are quick enough in spotting a change, we can pull most of this information out from the Default Trace. We can set up a DDL trigger to record all changes and store them in a log. For large production shops, we can use Service Broker to centralize the recording of DDL changes to all our databases.

We need to inspect all ‘uncontrolled’ changes. This process needs a human. Changes need to be understood. This requires eye-balling. We need to look at the differences between what’s on the live system and what source Control thinks is in this version.  Unless you have caught these changes via a system such as a DDL trigger, and you’ve not been in time to get them from the default trace, you will need to do this is by means of a schema-comparison tool. You can compare databases and synchronize them in Visual Studio, of course, but  it probably isn’t a good idea to do this against a production server! You can use DAC, either by using SQLPackage, or by automating the DAC library in PowerShell, to generate a build script that, effectively, lists the changes between the production system and the DACPC that was used to build it.  However, a schema-comparison tool with a user-interface will give you a side-by-side rendering of the metadata in the production database, and the metadata as represented by the version in source control, or the build from it on the integration server. You can then scroll through all the changes rapidly and  get a quick appreciation of any ‘drift’.

I generally don’t even touch the production system, but look at a recent backup set from it. I also compare directly with what’s in source control via a Scripts Folder. If  you haven’t yet discovered SQL Compare, then you’re in for a happy surprise, because SQL Compare does this fast and elegantly. You don’t even need a database, but can compare a scripts folder directly with a recent backup set.

With a schema-comparison tool, the process is straightforward. You just kick off a new project if it is the first time you’re doing the comparison, or select an existing one. You then specify whether you wish to use a database, backup, snapshot or scripts folder for the  database you want to check, and fill in the appropriate details. You do the same thing for the database you want to compare it against and away you go. You don’t need the advanced synchronization features: only metadata comparison, and you can navigate through the source by the type of database object and get immediate information about what database objects have changed, and how many of each.

If you’re looking to automate the process of drift detection, another approach to consider is to use Redgate’s recently released DLM Dashboard, it is designed to be usable in the simplest possible way. It allows you to  track every change to your database schemas.  As soon as a change occurs, DLM Dashboard will alert you to the details of who, what, and how of the database schema changes, via email notifications and a clear dashboard overview, and provide you with a history of the changes.

Rolling your own solution

One of the cutest  and easiest ways of detecting version drift is to use DACFx. This will tell you if a database has drifted but it will not tell you who did it, or give you a history of the changes. It also will tell you what objects have changed but won’t tell you what it is withing the object that has changed. It is designed just to tell you that a new version can be deployed because the installed database is at the version that you expect it to be, and nobody has slipped in any changes in the meantime.  To do this, the database has to be registered, which means that its metadata at the point of registration is stored in a table in MSDB, along with the version number.

Here is a simple test routine in PowerShell that will register a database if it is not yet registered, create a new stored procedure, and then check that the DAC can detect the change

This will give the result

This is a good start, and as it was a simple addition, it won’t take much investigation. However, if it is a modification to a complex database routine or table, and if you have a comparison tool such as SQL Compare, it is easy to dive in and see what modifications have been made. If you don’t have access to a database-schema comparison tool, then you can still do it, but in a rather more longwinded way. You use a differencing tool on the two source files. A Text comparison tool is part of every programmer’s toolkit, even if only used when embedded into your source control system. There are a few snags that I’ve come across with using this method: You need to script the live system, and compare it with an integration database (one built from the source in source control for a particular version), the script-generation process is surprisingly slow, and the comparison is purely text-based  rather than being semantic.


Hah! The foreign keys are unchecked in the clone

Using a Programmers File-Comparison Tool

WinMerge is on SourceForge, published under a GNU GPL. It is free and open-source. It is probably the most popular of the many tools that show the difference between two text files. There are plenty of others such as ExamDiff and Beyond Compare. None of these will help us much with an ordinary build script, such as the ones generated from SSMS, or are embedded in DACPACs (Model.SQL), because we need the order of all the objects to be the same in both scripts. SSMS will sort the tables in dependency order, but you’ll soon find that this can result in any number of orders for identical databases. We have to resort to SMO to get the compare script we want, which is merely a sequenced collection of all the individual scripts. Remember that, unlike a proper database comparison tool, a file comparison tool has no knowledge of semantic, only text differences: it doesn’t understand SQL!

To generate the scripts, we need to build the integration or staging server to the version in source control and compare this to an exact copy of what is in production, by restoring a backup. By happy coincidence, a test restore is something we need to do anyway, occasionally. We can then run a script that will do something like this. (I use PowerShell, but C#, F# or IronPython is just as good)

A Watched kettle never boils

Just don’t hang about, watching the screen, because little furry mammals evolve in the time this script takes to run; AdventureWorks takes two minutes locally. No, there are definitely good reasons for having SQL Comparison tools. However, with advance planning, we can prepare all the scripts overnight.

Of course, for a great deal of the database, it is much quicker to get the script from the metadata, via SQL, but the build-script for tables and the related DRI aren’t stored within the database metadata. We can create a SQL routine to do our table-build script but to get everything we need is going to be immensely complex. It is also possible to pull down the metadata to a local database but the devil is in the detail.

One way of speeding the process in C# is to increase the number of threads to an optimum of around six (thanks to Linchi Shea for discovering this), or in PowerShell we can run the scripts for each database simultaneously.

My solution would be to shrug and plan this in advance, running this on a scheduler.


Before any deployment, it is worth checking that the version of the database in production is the same as the version you think is there. If all is well, then we will know exactly what is in the production database since, for all practical purposes, it will be what’s in source control for the same version. If we’ve suffered ‘Version-Creep’ , and find there are differences, then we need to understand why these changes have happened and decide if they need to be made permanent by retrospectively legitimizing them in source control.

DevOps, Continuous Delivery & Database Lifecycle Management
Go to the Simple Talk library to find more articles, or visit for more information on the benefits of extending DevOps practices to SQL Server databases.