Cross-RDBMS Version Checks in Flyway
How does one check that a database is definitively at the version that Flyway says it is? Or that a test teardown procedure leaves no trace in the database? Or verify that an undo script returns a database's metadata to that state it should be in for the version to which you're rolling back? This article shows how to do high-level version checks, by comparing JSON models.
Let’s say someone sneaks in and makes a few direct changes to a database, such as altering a column, adding an index, or adjusting a constraint. Flyway has no independent means of detecting this problem, so how do we guard against it? If you’re a SQL Server or Oracle database developer, you’ll reach instinctively for SQL Compare. It has become such an intrinsic part of the workflow for these databases that it is hardly necessary to spell out the many reasons for its usefulness, although this article provides a good list. It is very handy for the ‘whack-a-mole’ game of ensuring that nobody unofficially changes a database from its designated version, since it can detect any drift in the current production database, just by comparing to the scripts directory in source control. I’ve already shown how SQL Compare helps instill this discipline into a Flyway development.
However, this and many other routine tasks often need to be done for a whole range of different databases systems that SQL Compare can’t currently work with. There is still a place for a cross-RDBMS way of achieving some of the tasks for which one instinctively reaches for SQL Compare.
Without SQL Compare, we cannot generate a script, or compare a script with a live database, but there are cross-RDBMs ways to do simple checks to find out what changed in a database. We won’t get the same level of detail that a schema comparison tool can provide, but sometimes it’s enough just to do a high-level comparison between a version of a database and the previous version to see what the migration has changed (which tables were created or dropped, or which columns tweaked). Just occasionally, we want to see if someone has snuck in a change to a database since we did the migration. Perhaps they’ve absent-mindedly gone in there and added an index: some people occasionally do that.
Cross-RDBMS version checks using JSON database models
How would you go about comparing a database as it exists now, with what it was when you originally did the migration to that version? Fortunately, my Flyway Teamwork framework allows you to save a JSON model of the database after each migration. You can then use this model as a definition of the version to check if any sneaky changes have been made since you did the migration. It is much easier to use for this purpose than source code, though the framework can save source code as well.
The model is generated in two different versions, a single file that maps all the objects in the database, and a directory that has a JSON file for each database object such as a table, procedure, view or function. These are in subdirectories according to the type of object. By comparing two different models of the same object, a source and a target, you can tell if it has been dropped, altered or added since the source model was generated.
In Flyway, there are several extra uses for these models.
Test ‘teardown’
When you are devising unit tests and integration tests that can be done as part of a migration, or within a callback, you will likely need to create a temporary table of some sort, possibly a function and quite possibly a procedure or two.
When you’re devising these tests, you need to be certain that no traces of any of these objects remains. Once the test is torn down, you need to check that the database is still identical to the version that resulted from the migration.
Verifying UNDOs
You also will probably want to create UNDO files for each migration, especially in a feature branch. You need to check that the UNDO file that you handcraft returns the database to the version that you specify.
Demonstrating a version check
Hopefully, you’re already using Flyway Teams with my Flyway Teamwork PowerShell framework, so already have a database at a particular version, with models saved for this and every previous version. In fact, all you need for detecting ‘drift’ is the latest version with a model, but you might also like to try out the comparisons between several different versions.
To play along, you might find the following script useful. It laboriously cleans the database and runs every migration in the series for the branch of the project in which you’re working. I often need to do this repeatedly, for test purposes, when I make a change to the framework. For every successful Flyway migration, it invokes various tasks included with my framework, including $SaveDatabaseModelIfNecessary
, which generates the JSON models.
In the demos for this article, I’m using the Develop branch of my Pubs project, for SQL Server, but I’ve provided similar projects for other RDBMSs, and the following code works for every RDBMS I have:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 |
$RunMigrationsWithTasks = { Flyway info #check that you've got a connection Flyway clean #clean the existing del .\versions\*.* -Recurse #and the reports for all the versions $ExecutedWell = $true; $LastVersionDone = ''; Dir ".\$($dbDetails.migrationsPath)\V*.sql" | foreach{ [pscustomobject]@{ 'file' = $_.Name; 'version' = [version]($_.Name -ireplace '(?m:^)V(?<Version>.*)__.*', '${Version}') } } | where { (!($StartVersion -ne $null -and $_.version -lt $StartVersion) -and !($EndVersion -ne $null -and $_.version -gt $EndVersion)) } | Sort-Object -Property @{ Expression = "version"; Descending = $false } | foreach{ if ($ExecutedWell) { Flyway migrate "-target=$($_.version)"; $ExecutedWell = $? } if ($ExecutedWell) { $PostMigrationTasks = @( $GetCurrentVersion, #checks the database and gets the current version number #it does this by reading the Flyway schema history table. $GetCurrentServerVersion, #get the current version of the database server $CreateBuildScriptIfNecessary, #Create a build script for the database in a #subdirectory for this version. $CreateScriptFoldersIfNecessary #create a script folder that can be used for comparing databases $SaveDatabaseModelIfNecessary, #Build a JSON model of the database that we can #later use for comparing versions to create a chronicle of changes. $CreateVersionNarrativeIfNecessary, #save the information from the history table #about when all the changes were made and by whom $SaveFlywaySchemaHistoryIfNecessary ) Process-FlywayTasks $DBDetails $PostMigrationTasks #if ($LastVersionDone -eq $DBDetails.version) # {$ExecutedWell=$false} $LastVersionDone = $DBDetails.version } } } |
With this I just set the context to the flyway project and branch and run it. Most of the code is just displaying a string that tells me whether I’m working on the right branch of the correct project.
1 2 3 4 5 |
cd <my-Path-To>\FlywayTeamwork\Pubs\Branches\Develop . '.\preliminary.ps1' Write-Output @" $($Env:USERNAME) is processing the $($dbDetails.variant) variant of $($dbDetails.branch) branch of the $($dbDetails.RDBMS) $($dbDetails.project) project using $($dbDetails.database) database on $($dbDetails.server) server with user $($dbDetails.installedBy)" "@ |
Phil is processing the develop branch of the sqlserver Pubs project using PubsDev database on Philf01 server with user Phil Factor"
Then, if you run the $RunMigrationsWithTasks scriptblock…
1 |
$RunMigrationsWithTasks.Invoke() |
…you’ll eventually get the feedback from all the migrations (the above code is designed for testing, so it batches this feedback and spouts it out at the end. This behavior is the same for individual tasks.).
The first thing to do is to check what the database thinks is the current version by reading the version in the Flyway Schema history table. We’ll just make sure that this version is in the $dbDetails hashtable.
1 |
Process-FlywayTasks $dbDetails $GetCurrentVersion |
Executed GetCurrentVersion in GetCurrentVersion, current version is 1.1.12, previous 1.1.11.
Uncontrolled database changes
We’ve now done a perfectly good series of migrations. Let’s now spoil the last one by making some “uncontrolled’ changes, adding an object and dropping a couple of them. My framework includes a convenient Execute-SQLStatement
cmdlet for executing this sort of ad-hoc SQL (don’t get too used to doing this with a real database!):
1 2 3 |
Execute-SQLStatement $dbDetails 'DROP VIEW dbo.TITLEVIEW' Execute-SQLStatement $dbDetails 'DROP procedure reptq3' Execute-SQLStatement $dbDetails 'create table dbo.DeleteMePlease (MyColumn int)' |
Creating a temporary model for the current database
Now, we need to create a new model for the “drifted” database, with these changes in it, to check that we can detect them. We’ll store these ad-hoc models in TEMP since we don’t want to keep them around.
First, we remove any existing temporary models:
1 2 3 4 5 6 7 |
@("$env:TEMP\version.json", "$env:TEMP\currentversion.json") | foreach { if (Test-Path $_) { Remove-Item $_; Write-Output "deleting $_" } } "$env:TEMP\model" | foreach { if (Test-Path $_ -PathType Container) {Remove-Item $_ -Recurse -Force; Write-Output "deleting $_" } } |
We then create the ad-hoc models (database- and object-level) using the $SaveDatabaseModelIfNecessary
task, which always checks to see if it is already done and otherwise refuses to do it.
1 2 3 4 5 |
Process-FlywayTasks $dbDetails $SaveDatabaseModelIfNecessary @( "$env:TEMP\version.json",#$MyOutputReport "$env:TEMP\currentversion.json", #MyCurrentReport "$env:TEMP\model"#MyModelPath ) |
Performing the version check by comparing database models
At this point, we have a model for the current Flyway version (V1.1.12, in my case) to tell us what should be in the database for that version, and we’ve then sneakily made changes to this database and created a temporary model for it. Can we detect the ‘uncontrolled’ changes?
Firstly, we’ll just try loading the complete models and see whether we can detect the changes by using the Diff-Objects cmdlet:
1 2 3 4 5 6 7 8 |
$currentVersionReportPath = "$($dbDetails.reportLocation)\$($dbDetails.Version)\Reports" #see what is changed by comparing the models before and after the migration $Comparison = Diff-Objects -Parent $dbDetails.Database -depth 10 ` ([IO.File]::ReadAllText("$env:TEMP\version.json") | ConvertFrom-JSON)<#and get the current model from file#> ` ([IO.File]::ReadAllText("$currentVersionReportPath\DatabaseModel.JSON") | ConvertFrom-JSON) | where { $_.match -ne '==' } |
Be aware that this method isn’t going to scale well: even with this small sample database, it took a few seconds. SQL Compare would have told you this a lot faster.
The $Comparison
object contains details of the differences detected. It can easily be converted into a report that makes even more obvious sense.
Ref Source Target Match --- ------ ------ ----- PubsDev.dbo.table.DeleteMePlease (PSCustomObject) <- PubsDev.dbo.table.DeleteMePlease.columns MyColumn int <- PubsDev.dbo.stored procedure.reptq3 (PSCustomObject) -> PubsDev.dbo.stored procedure.reptq3.columns[0] @lolimit Dollars -> PubsDev.dbo.stored procedure.reptq3.columns[1] @hilimit Dollars -> PubsDev.dbo.stored procedure.reptq3.columns[2] @type char(12) -> PubsDev.dbo.view.titleview (PSCustomObject) -> PubsDev.dbo.view.titleview.columns[0] title nvarchar(255) -> PubsDev.dbo.view.titleview.columns[1] au_ord tinyint -> PubsDev.dbo.view.titleview.columns[2] au_lname nvarchar(80) -> PubsDev.dbo.view.titleview.columns[3] price money -> PubsDev.dbo.view.titleview.columns[4] ytd_sales int -> PubsDev.dbo.view.titleview.columns[5] pub_id char(8) ->
Performing the version check by comparing object-level models
That was slow, a few seconds, but we filled the time happily by walking the cat and staring out of the window. However, there is a much quicker way of doing this. Luckily, we also store the model at the object level. There are subdirectories for each type of database object. Each subdirectory has a separate file for each object.
If the content of an object-level file is different (the file hash has changed), between versions, then we can be confident that the object has changed. This was done for source control, but it is easy to use for comparing databases.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
<# we start by getting an array of objects that give the filepath, hash, object-name and object-type of the entire directory in our temporary model for the current state of the database #> $Latest = dir "$env:TEMP\model\*.json" -Recurse | foreach { $name = $_.Name.Replace('.json', ''); $type = $_.Directory.Name; Get-FileHash $_ } | foreach{ @{ 'Hash' = $_.Hash; 'Path' = $_.Path; 'Type' = $type; 'Name' = $name } } <# we now get an array of objects that give the filepath, hash, object-name and object-type of the model directory for the version of the database that it should be #> $current = dir "$($dbDetails.ReportLocation)\$($param1.Version)\model\*.json" -Recurse | foreach { $name = $_.Name.Replace('.json', ''); $type = $_.Directory.Name; Get-FileHash $_ } | foreach{ $Hash = $_.Hash; $Path = $_.Path; $match = $latest | where { $_.name -eq $name } if ($match -eq $null) { write-warning "$type $name has been dropped" } elseif ($match.Hash -ne $hash) # compare the hashes of the files to see if they are the same { write-warning "$type $name has been changed " }; # save this for the backward comparison! @{ 'Hash' = $Hash; 'Path' = $Path; 'Type' = $type; 'Name' = $name } } <# now we check to find all the objects that have been added #> $latest | foreach{ $name = $_.name; $Type = $_.type; $match = $current | where { $_.name -eq $name } if ($match -eq $null) { write-warning "$type $name has been added" } } |
We’ve demonstrated the technique by writing warnings for every change
WARNING: view dbo.titleview has been dropped WARNING: stored procedure dbo.reptq3 has been dropped WARNING: table dbo.DeleteMePlease has been added
Naturally, you can refine this to take all the files that have differences in the hash and seeing how they’ve changed. This will show up changes in constraints, columns and indexes. You’d use Diff-Objects
to do this.
Conclusion
So, we can now do a high-level comparison of two different versions of a database, for any of the major RDBMSs that SQL Compare doesn’t yet cover, using Flyway and the Flyway Teamwork Framework. It isn’t done just out of curiosity, but as a way of ensuring that a database is at the version it claims. This is useful when creating test harnesses that must be torn down, or for ensuring the efficacy of UNDO files or in creating code to rollback failed migration changes in MySQL/MariaDB. It also ensures that no accidental or unofficial changes have crept into the database. There are other uses too. For example, we could extend it to be able to determine the version of a database that has lost its flyway table.
A model is one of those features that increase in usefulness once you’ve decided to include them. They can be used for many retrospective metadata queries for previous versions of a database that are no longer live, for example. They are great for tracking the history of changes to a particular table over migrations. Debugging and attaching blame is then so much easier.
Tools in this post
Flyway Teams
Ideal for organizations looking to improve collaboration and fine tune their processes during development and the deployment of database changes.