Product articles Flyway
Getting an Overview of Changes to a…

Getting an Overview of Changes to a PostgreSQL Database using Flyway

How to use Flyway and PowerShell to automatically generate a database build script every time Flyway successfully created a new version. You can then investigate schema changes between versions simply by using a Diff tool to compare build scripts.

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

Imagine that you are a database developer, working on a new feature branch within the development branch of the database. You start off well enough, with a Flyway migration file that creates the development version of the database from zero (a blank database) to the point at which you branched. Then you get rather carried away trying out lots of different ideas, running some tests and checks, using a visual tool to generate code for tables, and so on. It is all going so well until a little voice inside your head says, “Hang on Buddy, this is supposed to go inside a migration”. What do you do now?

Until your work is applied by Flyway, as a migration file that provides a version number to the migrated database, your development changes are outside of Flyway’s knowledge or control. It will still report the database version as being unchanged from its original version, at the point of branching. Your job now is to understand which objects changed, and then capture those changes, retrospectively, into a Flyway versioned migration.

How you do that depends on the RDBMS that you are using, but there always a way. If you have a schema comparison tool, such as SQL Compare, the task is easy as it can compare the two versions and generate a first-cut migration script for you. However, with many RDBMSs, you’re not going to have that luxury. I’ll demonstrate how to generate build scripts for each new version of a PostgreSQL database, and then compare them to get an overview of the differences between two versions. You can then work through all the differences and write a script that does the necessary migration.

Capturing changes into a migration script

The task sounds simple. First, you need to find out which objects changed and how, by comparing your branch database – the result of your wild development efforts – to the current version that Flyway knows about. In this example, that’s the database version at point of branching. Second, you need to produce a Flyway migration file to capture those changes. Finally, you need to check that it worked. How? By making sure that the branch database built by the Flyway migration run is the same as the result of your glorious build spree. For example, build a second database, as it existed when you created your branch, apply the migration file, and then compare the resulting database to your current branch database. They should be identical.

However, if you have no database comparison tool for the RDBMS that you are using, the only obvious way to know which objects have changed and how, between two versions, is to compare build scripts. This means that we need to generate a build script on every migration, so we always have a saved record of the metadata for each version.

We need a build script for the current branch, and one for the database as it existed at the point you created the branch. As long you generate both build scripts using the same tool, and in the same way, you can then use a good Diff tool to compare the two and find out what has changed. This will give you a head start in producing the migration script, although of course it won’t produce the file for you! I’ll demonstrate this approach using PostgreSQL as an example.

With SQL Server, MySQL and Oracle, the whole process is made much simpler if you have Redgate’s Schema Compare tool. It will accept as either source or target a database or a database build script or a set object build scripts. You can configure it to generate the first cut of a Flyway-compatible migration file, and you can use it as either a user-interface or a CLI tool with the same configuration/project file. I’ll cover this in a later article.

Finding out what changed in a PostgreSQL database

If you don’t have a suitable schema comparison tool, then the simplest way of inspecting what changes you’ve made to a database, for any relational database, is to compare build scripts. It is good for the broad sweep as long as you don’t reformat the code of either build script. Diff tools can only compare text, not semantics.

I’ll illustrate this simple approach with an example that uses Flyway community, my FlywayTeamwork PowerShell framework and a PostgreSQL database. The framework includes some built-in tasks that allow us to generate a build script after every migration run, for SQLite, PostgreSQL or SQL Server. The idea is that, by doing this, you’ve always got a ‘baseline’ build script. Now you can do an ad-hoc build-script, having made a bunch of development changes to your branch database, and compare the two scripts to see what’s changed.

I’ve put a PostgreSQL version (and a SQLite version) of the Pubs database project on GitHub, in the branch-based Flyway project called FlywayTeamwork-Pubs. I’d encourage you to use this framework, for Windows PowerShell work, because it protects you from various complications. For example, I’ve already done the hard work behind the scenes, which is to persuade pg_dump and psql to do what we want.

Before you start, you’ll need to install the PostgreSQL utilities. I use Chocolatey for this sort of thing because it makes upgrading easy. You’ll also need a Pubs PostgreSQL database on the PostgreSQL server you create. I use HeidiSQL when using several database systems, but PgAdmin is, of course, the go-to IDE. I cover some of the basics of working with Flyway and PostgreSQL in Getting Started with Flyway Migrations on PostgreSQL

Flyway project configuration

In the root directory of the FlywayTeamwork-Pubs project (I described this project in more detail here: Flyway Branching Walkthrough) you’ll find a Resources folder with all the script-based tasks. You’ll need to add the paths to the pg_dump and psql utilities to the start of DatabaseBuildAndMigrateTasks.ps1 file, after the comments section. I’ve shown my paths as an example, but yours will be different!

Here is the Develop branch directory for the project:

Flyway project for a PostgreSQL database

You need to be sure that the migration files are in the /Migrations subdirectory. I’ve provided scripts to migrate the pubs database to V1.1.6. If you are re-running this, you need to delete any existing build scripts because the framework will only create them if they don’t already exist.

You may need to copy the ‘Preliminary’ file from the ‘resources’ directory at the base of the project if it doesn’t already exist. This routine gathers up all the information you need for running the PowerShell script block tasks that add functionality to Flyway and makes sure that PowerShell has compiled all the utility cmdlets you need.

The important thing to do now is to create the project-level Flyway.conf. The first four items are important, and the last four will be useful at a later stage. We will be identifying databases uniquely not only by their version but also their branch and variant.

Generating build scripts for each new Flyway version

This example uses Flyway Community, executed within a PowerShell script. Basically, we force Flyway to create every version as a separate migration run. We add a few lines of code after the migration run to call the $GetCurrentVersion task, to get the new version number, and then the $CreateBuildScriptIfNecessary to generate the build script for the new version. For PostgreSQL it generates the build script using the pg_dump utility and for SQL Server, it uses SQL Compare. For SQLite it uses Ssqlite.exe.

With Flyway Teams, you could add these tasks to an afterMigrate script callback, so you wouldn’t even need to use PowerShell directly. There is good reason for doing this automatically as part of the migration routine. You can’t create the build script for the version retrospectively because you’ve very likely made changes, via added DLL code, since the last migration.

Build scripts are a good record of a version and I usually put them in source control at the point at which Flyway does its run. This gives you another way to check changes between versions, doing them within Git.

The following example migrates to each of the six versions of the PostgreSQL database, in turn, each time creating the metadata build script for that version. We store the build script in a separate subdirectory that we can keep for comparing versions. The routine only does this if there isn’t already a build script there.

And we look in the ‘versions’ subdirectory to see that a directory has been made for each version

Build artifacts automatically generated for every new version of the database

What’s inside, you wonder? Each version directory has a scripts directory (One day, you’ll have reports and maybe models in there as well). In the Scripts directory, there is a file. Hmm, what’s this?

A build script for a PostgreSQL database

Comparing build scripts

Now that we have a ‘baseline’ build script for every version, we can generate an ad-hoc build script for our development database and compare it to its baseline build script to get an overview of what’s changed.

So, what’s changed between version 1.1.2 and version 1.1.6? Simple, we open our diff tool (I’m using Devart Code Compare) and do a file comparison to see what is different between the two build scripts.

Comparing database build scripts using a Diff tool

Now, of course, you must work through all the differences and write a script that does the necessary migration. You’ll also observe from the screengrab, though, that there are certain limitations. The file comparison has flagged up capitalization and whitespace differences that disguise the fact that there are no functional differences. You can set all good DIFF tools to ignore whitespace, capitalization, line-breaks or some comments.

Some Diff tools will compare SQL and colorize it properly and even allow you to create a merge file interactively.

Conclusions

If we have build scripts for every database version, we can compare any two versions whenever necessary. We can also compare what we’ve built with what is so far known to Flyway as a version. It also provides us with a record of changes that are easily consumed by any source control system.

It is important to know of this way of finding database differences because it is a good generic way of understanding what changes have been made to the database. It doesn’t generate a migration script, but it goes a long way down that road, especially if you have a DIFF utility that facilitates merging. To return to my original example of the database developer who needs to use a range of tools to develop a database, it becomes easier to create a migration file once you have a good DIFF tool. If it happens that objects aren’t changed, but merely appear from nowhere, then with a quick merge, you’re just about there with your migration file! Real-life database development work isn’t often like that though!

 

Tools in this post

Flyway

DevOps for the Database

Find out more