Flyway Baselines and Consolidations

Phil Factor demonstrates why we occasionally need to 'baseline' a database, when automating database deployments with Flyway, and a simple way of reducing the number of migration files that are required for a build.

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

In a previous article, Getting Started with Flyway and SQL Server, I showed how to use Flyway to produce the classic chain of migration files to build any version of the database that you require, complete with data. Here, I’ll introduce two techniques: First, doing a baseline of an existing installation of the database that wasn’t built using Flyway, to make it upgradable by Flyway. Second, showing how to condense, or consolidate,  migrations that you no longer need into a single initial script.

> For a ‘learning tree’ showing other articles in this series, see: Flyway Content Map.

Baselining an existing database

When you present an empty database to Flyway, it will use the Scripts folder to apply the first script in the chain of versioned files in the directory, followed by the others in version order, and thereby build successive versions of the database from them. Flyway proceeds until it reaches the version you request. As it works through the chain of migrations, it updates the history of this chain of actions to the database, in schema history table of the target database, along with the version no. and checksum of each migration file in the directory.

In other words, Flyway holds all the information it needs for a migration within the target database, in the Flyway schema history table, not the Scripts folder. It does this so that it can safely update any number of databases at different versions using just one Scripts directory. If you change the connection, and use another blank database, it will repeat the process. If you present an older version of the database, Flyway will check the schema history table for its version and then update the database, applying all scripts at a later version.

If this Flyway schema history table doesn’t yet exist in a copy of the database, it can be placed in it retrospectively by creating a ‘baseline’. In Flyway, a baseline is simply an entry in the history table, with a version that tells Flyway the point from which to start all subsequent migrations. The baselining process doesn’t create any scripts or other files in the Scripts directory.

Imagine that AdventureWorks is your current production database. You are selling bicycles and parts. Very sensibly, you have recently adopted Flyway to automate your database deployments, and you have added the source of the database (a build script) to a Flyway Scripts directory as v1.3.1 and added a v1.3.2 migration with an extra view requested by sales for listing the top ten best performing salespeople.

You test the build and migration on an empty target database. It all went well, and it has been tested and approved. You’re ready to release. As it is a production database, you will be doing a migration on the live system, with all its changed data. However, when you try it, you get this error:

Flyway : ERROR: Found non-empty schema(s) [dbo] but no schema history table.
Use baseline() or set baselineOnMigrate to true to initialize the schema history table.

You suddenly realize that your production system isn’t yet known to Flyway, and so Flyway doesn’t know what version it is at. To migrate an existing database, Flyway needs to get the current version of the target, from the schema history table, and it doesn’t exist. Flyway will refuse to execute against any existing, ‘un-versioned’ database that has any schemas that have objects in them, but no Flyway schema history table.

To get around this, you can apply a version to the database, either by ‘baselining’ the database or by setting the config switch BaselineOnMigrate to true to initialize the Flyway schema history table. It is rare to need to baseline a database, but it is occasionally a useful way of correcting this sort of error.

So, we baseline this (imaginary) production database of AdventureWorks, previously untouched by Flyway, and tell it that it’s at v1.3.1. Whenever you execute the Flyway Baseline action, you provide a baseline version using the -baselineVersion parameter. This baseline version, which you can also name with a baselineDescription, will be tagged as being the “baseline” and this will be where any subsequent migration starts.

Let’s try all this out. First, we’ll run the deployment on an empty AdWorks database. I’ve provided a script in the GitHub project, CreateAdworksDatabase.sql, to do this. The Scripts directory of your project looks like this:

Two Flyway migration scripts for AdventureWorks

Now, we perform the database build and migration using the following PowerShell script. If you’re using SQL authentication, you’ll need to supply the username and password (I demonstrate a more secure way of doing this later):

We execute it and see that it has worked. We now have the revised version.

+-----------+---------+-------------------+------+---------------------+---------+
| Category  | Version | Description       | Type | Installed On        | State   |
+-----------+---------+-------------------+------+---------------------+---------+
| Versioned | 1.3.1   | Adventureworks    | SQL  | 2021-02-22 10:14:02 | Success |
| Versioned | 1.3.2   | TopTenSalesPerson | SQL  | 2021-02-22 10:14:03 | Success |
+-----------+---------+-------------------+------+---------------------+---------+

Now we want to migrate our pretend production database, AdventureWorks2016, to V1.3.2. We change the value in $database run the migration:

Oh dear. Flyway won’t allow that because there is no schema history table in the production database, so it can’t work out whether to migrate.

Flyway : ERROR: Found non-empty schema(s) [dbo] but no schema history table. Use baseline() or set baselineOnMigrate to true to initialize the schema history table.
At line:1 char:1
+ Flyway migrate @FlywayArgs
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (ERROR: Found no... history table.:String) [], RemoteException
+ FullyQualifiedErrorId : NativeCommandError

So, we create a baseline, as follows:

We see that it worked…

Creating Schema History table [AdventureWorks2016].[dbo].[flyway_schema_history] with baseline ...
Successfully baselined schema with version: 1.3.1

And if we run …

…we see that the baseline has been successfully created as an entry in the schema history table, and because we already have existing migrations, in the Scripts folder, Flyway flags any files with a version higher than 1.3.1 as “Pending” (and any lower than 1.3.1 as “Below Baseline”):

Baseline for a production database

So, we try the migration again:

This time we’ve succeeded, and the new version is in production:

Database Migration from baseline

Consolidating early migration scripts

The first file in any chain of migration scripts is usually, in effect, a build script because Flyway will only execute it against an empty database or, if it isn’t, one that has an existing history table that shows the database to be at a lower version number.

If you have too many migration files and the process of building a database just takes too long, you can get around this most easily by exploiting the fact that Flyway will happily execute a migration chain, if the database is empty. You simply replace, with a single build script, all the migration steps that you no longer need because they’re too old to be of interest.

Once the requirement for copies of the early versions has died down, we can archive the old scripts and replace them with just the build to the earliest version that you still might conceivably need. We’ll use our Pubs database to demonstrate this. First, we’ll set up a scripts folder with a chain of sequential migrations, using a modified form of a migration that we used in the previous article. I’ve made these files available in the GitHub project (these is also an afterMigrate script in the project that’s used in other articles to add a version stamp. Just add a ‘d’ to start of its name for this example):

SQL Migration scripts for Pubs database

You’ll see that I’ve got a file beginning with a ‘d’. That is to make Flyway ignore it for the build. Flyway recognizes ‘V’ for versioned, ‘U’ for undo, and ‘R’ for repeatable and ignores any file that starts with anything else. We’ve given the first file a ‘d’ prefix for the first character so that this file is invisible to Flyway for the time being.

Setting up Flyway command line parameters, including credentials

In the first example, we provided credentials directly, using parameters, just to keep things simple. Here, with the second PowerShell script, we’ll use parameters again, to tell Flyway what we want to achieve, but this time I’ll demonstrate a technique that allows a user who has UserName and Password credentials to keep passwords in a secure place within the user folder. If a password is not already stored, you are asked for it once.

We will create a PubsFly database to try out the migration, on the server, MyServer (you’ll change that, I hope). This database should be empty.

Running the Flyway migration

Having generated these arguments for the command line we then clean out the database and apply the migration scripts to the empty database.

And now we check that all is well:

A complete run of migrations

Condensing many migrations scripts into one

Let’s say we’ve now decided that we don’t need any version before 1.1.4. The first task then is to generate a build file for v.1.1.4. Many database systems include tools that can retrospectively generate a build script for the current database, however it was created. For SQL Server, you can use your usual SQL Editor such as SSMS (Tasks > Generate Scripts).

Similarly, MySQL has Workbench Central’s ‘Forward engineer’ wizard. Even SQLite has the .dump command for its tables. Some other databases use the term ‘export’ or ‘build’. Some of these database systems allow you to either script out or dump the data too.

It is usually easy to automate this build, possibly with an afterMigrate callback script, so that you can generate a build script for every new version, if it doesn’t yet exist.

In this example, we just need to rename our existing build script for v1.1.4, changing the ‘d’ into a ‘V’, and then archive the existing migration files that, when executed sequentially, create the same database. The new scripts folder looks like this:

Condensing Flyway migration scripts

So, we start off with an empty database and see if we can still create it from the truncated migration chain where we’ve consolidated four steps into one.

Pending database migration files

Now we can just ask Flyway to migrate it to the latest version:

Successful "rebasing" of migrations

Yes, it takes it to version six in just three migrations, starting at version 1.1.4. We’ve consolidated all the early migrations that we no longer need into just one and, in effect, “rebased” the database project to V1.1.4.

Conclusion

We’ve shown, in this article, why the ‘baseline’ action is occasionally needed in Flyway. We’ve also shown a simple way of reducing the clutter of migration files that aren’t necessary for the build. We’ve rather ducked some of the issues, such as the management of version control, but Flyway is flexible enough to deal with awkward database development problems.

We’ve also avoided the question of how to reduce the clutter of version files between releases, or at any points in the migration run other than the beginning. This is more awkward because it has to take into account how to do a migration step between two different versions while preserving any existing data. Although it is possible to do this, I’d recommend using, where possible, the technique of always rolling up the earliest migrations by combining them into build script, though a time will come that a database might grow to the point where single build script can become unmanageable.

It is unusual to get caught out with a niche requirement when using Flyway. One of the benefits of the strong community involvement in Flyway is that the tool developers get to hear about problems with edge cases, and unusual types of database that are difficult for tool designers to imagine or predict. This has resulted in a tool that bristles with configuration switches, features, and settings that allow the users to get things done easily.

Tools in this post

Flyway

DevOps for the Database

Find out more