Cross-RDBMS Version Checks in Flyway

How does one check that a database is definitively at the version that Flyway says it is? Or that a test teardown procedure leaves no trace in the database? Or verify that an undo script returns a database's metadata to that state it should be in for the version to which you're rolling back? This article shows how to do high-level version checks, by comparing JSON models.

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

Let’s say someone sneaks in and makes a few direct changes to a database, such as altering a column, adding an index, or adjusting a constraint. Flyway has no independent means of detecting this problem, so how do we guard against it? If you’re a SQL Server or Oracle database developer, you’ll reach instinctively for SQL Compare. It has become such an intrinsic part of the workflow for these databases that it is hardly necessary to spell out the many reasons for its usefulness, although this article provides a good list. It is very handy for the ‘whack-a-mole’ game of ensuring that nobody unofficially changes a database from its designated version, since it can detect any drift in the current production database, just by comparing to the scripts directory in source control. I’ve already shown how SQL Compare helps instill this discipline into a Flyway development.

However, this and many other routine tasks often need to be done for a whole range of different databases systems that SQL Compare can’t currently work with. There is still a place for a cross-RDBMS way of achieving some of the tasks for which one instinctively reaches for SQL Compare.

Without SQL Compare, we cannot generate a script, or compare a script with a live database, but there are cross-RDBMs ways to do simple checks to find out what changed in a database. We won’t get the same level of detail that a schema comparison tool can provide, but sometimes it’s enough just to do a high-level comparison between a version of a database and the previous version to see what the migration has changed (which tables were created or dropped, or which columns tweaked). Just occasionally, we want to see if someone has snuck in a change to a database since we did the migration. Perhaps they’ve absent-mindedly gone in there and added an index: some people occasionally do that.

Cross-RDBMS version checks using JSON database models

How would you go about comparing a database as it exists now, with what it was when you originally did the migration to that version? Fortunately, my Flyway Teamwork framework allows you to save a JSON model of the database after each migration. You can then use this model as a definition of the version to check if any sneaky changes have been made since you did the migration. It is much easier to use for this purpose than source code, though the framework can save source code as well.

The model is generated in two different versions, a single file that maps all the objects in the database, and a directory that has a JSON file for each database object such as a table, procedure, view or function. These are in subdirectories according to the type of object. By comparing two different models of the same object, a source and a target, you can tell if it has been dropped, altered or added since the source model was generated.

In Flyway, there are several extra uses for these models.

Test ‘teardown’

When you are devising unit tests and integration tests that can be done as part of a migration, or within a callback, you will likely need to create a temporary table of some sort, possibly a function and quite possibly a procedure or two.

When you’re devising these tests, you need to be certain that no traces of any of these objects remains. Once the test is torn down, you need to check that the database is still identical to the version that resulted from the migration.

Verifying UNDOs

You also will probably want to create UNDO files for each migration, especially in a feature branch. You need to check that the UNDO file that you handcraft returns the database to the version that you specify.

Demonstrating a version check

Hopefully, you’re already using Flyway Teams with my Flyway Teamwork PowerShell framework, so already have a database at a particular version, with models saved for this and every previous version. In fact, all you need for detecting ‘drift’ is the latest version with a model, but you might also like to try out the comparisons between several different versions.

To play along, you might find the following script useful. It laboriously cleans the database and runs every migration in the series for the branch of the project in which you’re working. I often need to do this repeatedly, for test purposes, when I make a change to the framework. For every successful Flyway migration, it invokes various tasks included with my framework, including $SaveDatabaseModelIfNecessary, which generates the JSON models.

In the demos for this article, I’m using the Develop branch of my Pubs project, for SQL Server, but I’ve provided similar projects for other RDBMSs, and the following code works for every RDBMS I have:

With this I just set the context to the flyway project and branch and run it. Most of the code is just displaying a string that tells me whether I’m working on the right branch of the correct project.

Phil is processing the develop branch of the sqlserver Pubs project using PubsDev database on Philf01 server with user Phil Factor"

Then, if you run the $RunMigrationsWithTasks scriptblock…

…you’ll eventually get the feedback from all the migrations (the above code is designed for testing, so it batches this feedback and spouts it out at the end. This behavior is the same for individual tasks.).

The first thing to do is to check what the database thinks is the current version by reading the version in the Flyway Schema history table. We’ll just make sure that this version is in the $dbDetails hashtable.

Executed GetCurrentVersion
in GetCurrentVersion, current version is 1.1.12, previous 1.1.11.

Uncontrolled database changes

We’ve now done a perfectly good series of migrations. Let’s now spoil the last one by making some “uncontrolled’ changes, adding an object and dropping a couple of them. My framework includes a convenient Execute-SQLStatement cmdlet for executing this sort of ad-hoc SQL (don’t get too used to doing this with a real database!):

Creating a temporary model for the current database

Now, we need to create a new model for the “drifted” database, with these changes in it, to check that we can detect them. We’ll store these ad-hoc models in TEMP since we don’t want to keep them around.

First, we remove any existing temporary models:

We then create the ad-hoc models (database- and object-level) using the $SaveDatabaseModelIfNecessary task, which always checks to see if it is already done and otherwise refuses to do it.

Technical aside: Framework tasks can be persuaded to take extra parameters for one-off use, but you can’t chain them because providing each task its own unique parameters is too wearying!

Performing the version check by comparing database models

At this point, we have a model for the current Flyway version (V1.1.12, in my case) to tell us what should be in the database for that version, and we’ve then sneakily made changes to this database and created a temporary model for it. Can we detect the ‘uncontrolled’ changes?

Firstly, we’ll just try loading the complete models and see whether we can detect the changes by using the Diff-Objects cmdlet:

Be aware that this method isn’t going to scale well: even with this small sample database, it took a few seconds. SQL Compare would have told you this a lot faster.

The $Comparison object contains details of the differences detected. It can easily be converted into a report that makes even more obvious sense.

Ref                                            Source           Target                Match
---                                            ------           ------                -----
PubsDev.dbo.table.DeleteMePlease               (PSCustomObject)  <-   
PubsDev.dbo.table.DeleteMePlease.columns       MyColumn int      <-   
PubsDev.dbo.stored procedure.reptq3                             (PSCustomObject)      ->   
PubsDev.dbo.stored procedure.reptq3.columns[0]                  @lolimit Dollars      ->   
PubsDev.dbo.stored procedure.reptq3.columns[1]                  @hilimit Dollars      ->   
PubsDev.dbo.stored procedure.reptq3.columns[2]                  @type char(12)        ->   
PubsDev.dbo.view.titleview                                      (PSCustomObject)      ->   
PubsDev.dbo.view.titleview.columns[0]                           title nvarchar(255)   ->   
PubsDev.dbo.view.titleview.columns[1]                           au_ord tinyint        ->   
PubsDev.dbo.view.titleview.columns[2]                           au_lname nvarchar(80) ->   
PubsDev.dbo.view.titleview.columns[3]                           price money           ->   
PubsDev.dbo.view.titleview.columns[4]                           ytd_sales int         ->   
PubsDev.dbo.view.titleview.columns[5]                           pub_id char(8)        ->

Performing the version check by comparing object-level models

That was slow, a few seconds, but we filled the time happily by walking the cat and staring out of the window. However, there is a much quicker way of doing this. Luckily, we also store the model at the object level. There are subdirectories for each type of database object. Each subdirectory has a separate file for each object.

If the content of an object-level file is different (the file hash has changed), between versions, then we can be confident that the object has changed. This was done for source control, but it is easy to use for comparing databases.

We’ve demonstrated the technique by writing warnings for every change

WARNING: view dbo.titleview has been dropped
WARNING: stored procedure dbo.reptq3 has been dropped
WARNING: table dbo.DeleteMePlease has been added

Naturally, you can refine this to take all the files that have differences in the hash and seeing how they’ve changed. This will show up changes in constraints, columns and indexes. You’d use Diff-Objects to do this.

Conclusion

So, we can now do a high-level comparison of two different versions of a database, for any of the major RDBMSs that SQL Compare doesn’t yet cover, using Flyway and the Flyway Teamwork Framework. It isn’t done just out of curiosity, but as a way of ensuring that a database is at the version it claims. This is useful when creating test harnesses that must be torn down, or for ensuring the efficacy of UNDO files or in creating code to rollback failed migration changes in MySQL/MariaDB. It also ensures that no accidental or unofficial changes have crept into the database. There are other uses too. For example, we could extend it to be able to determine the version of a database that has lost its flyway table.

A model is one of those features that increase in usefulness once you’ve decided to include them. They can be used for many retrospective metadata queries for previous versions of a database that are no longer live, for example. They are great for tracking the history of changes to a particular table over migrations. Debugging and attaching blame is then so much easier.

Tools in this post

Flyway Teams

Ideal for organizations looking to improve collaboration and fine tune their processes during development and the deployment of database changes.

Find out more

Redgate Flyway

DevOps for the Database

Find out more