Product articles
Flyway
Callbacks and placeholders
One Flyway Migration Script for Diverse…

One Flyway Migration Script for Diverse Database Systems

How to create a single set of SQL migration scripts for Flyway that we can use across multiple database systems, or for all regional variants of a database.

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

JDBC and ODBC drivers, to be valid, must work to a restricted dialect of SQL, across databases. While this includes both DML and DDL, it is not really sufficient for building a database. These dialects are all defined and specified by the ODBC standard for drivers, and a driver can conform to minimum, core or extended SQL-92. This means that you can even run SQL Queries on text or Excel spreadsheets, or Create Excel databases. Outside this grammar, when using triggers, procedures or functions, you are in rather a wild country, but one that can be navigated cautiously. There is a lot that wasn’t specified. The trick is to keep things simple.

We’ll be using Flyway placeholders to accommodate differences in dialect between database systems, to allow us to run cross database Flyway migrations. I’ll demonstrate installing the classic sample database, Pubs, on four different database systems, by running a single shared RDBMS-agnostic set of Flyway migrations. We’ll do it for SQL Server, PostgreSQL, MariaDB and, just to show off a bit, SQLite. The code and scripts are here.

Why? Just Why?

I’ll admit that I could save time by phoning every person who needs to run the same database on four different database systems at once. However, I have a data-driven application that needs to be on three, due to the vagaries of hosting. More important are the techniques I illustrate, such as the commenting out of blocks of code. A lot of us deal with ‘variants’, where a version of a database has several different variants, such as an accounting system that must work appropriately for the financial regulations in force in the region. We use the same techniques to deal with this.

What could go wrong?

This shared SQL-92 grammar is useful for the developing polyglot databases, but you’ll soon find discrepancies. The text of the SQL within defaults, for example, can contain code that is for a particular SQL Dialect. The standard doesn’t deal with incrementing IDENTITY fields. There is no shared concept of a clustered index. SQLite just doesn’t do ALTER DATABASE, so we need to shoehorn all the constraints into the CREATE TABLE statements. SQLite is a stranger to procedures too, and there is no realistic way around that.

There are many other problems that you’ll come across. You need to deal with all these issues in Flyway by using ‘macros’ or ‘placeholders’. Every database system has the same placeholder keys or names, but the value of a placeholder will be unique to the database system to which you are connecting.

Supporting multiple SQL dialects in a single set of migration scripts

To achieve a single source of migration scripts to build the same database for four different relational database systems we will need to use a special migration source code that has placeholders that are set differently for each database system.

Using Flyway placeholders

Flyway uses placeholders to allow us to specify values that can be used by any executable file. They perform the same role as ‘macros’ in traditional tools. Before Flyway executes a SQL file, it looks for placeholders and will swap them out for the run-time values before the code is executed. See: Customizing Database Deployments using Flyway Callbacks and Placeholders.

Flyway supports placeholder replacement with configurable prefixes and suffixes. These look like ${myplaceholder}. You can set the values of placeholders from the command line using parameters, from environment variables or from config files. They can be environment variables or be used via the Java API, though these latter methods of using them aren’t much use to us in PowerShell scripts. (See node-flywaydb and node-db-migrate for an interesting way of interacting with Flyway’s API).

There are some useful built-in placeholders that we can use to pass information to any SQL callback or script, such as:

  • ${flyway:defaultSchema} – the default schema for Flyway
  • ${flyway:user} – the user Flyway will use to connect to the database
  • ${flyway:database} – the name of the database from the connection url
  • ${flyway:timestamp} – the time that Flyway parsed the script, formatted as 'yyyy-MM-dd HH:mm:ss'
  • ${flyway:filename} – the filename of the current script
  • ${flyway:workingDirectory} – the user working directory as defined by the ‘user.dir’ System Property
  • ${flyway:table} – the name of the Flyway schema history table

Building dialect dictionaries

To proceed, we need to build up a dialect dictionary for each RDBMS with values for all the placeholders we need to account for the vagaries of supporting different SQL dialects.

We’ll create our own custom placeholders for each database system and place them in a dialect array. This can then be used by asking Flyway to read in the values for that particular database system.

I’ve used the good old Pubs database from days of old. It has procedures, views and triggers as well as tables. It also has user-defined types, but I haven’t attempted to deal with these as only PostgreSQL and SQL Server recognize them. I’ve added the constraints but unfortunately, had to reduce the power of the CHECK constraint, because only SQL Server recognizes the regex syntax that allows you to specify a character range or character collection using [] brackets. I had to give up on triggers for this project because they were all so different in character.

SQLite doesn’t have an ALTER TABLE statement which causes problems. You need to save the old data in a temporary table, create a new table under a new name, fill it from the temporary table, delete the old table and the temporary table and finally rename the new table to the old! Fortunately, this exercise doesn’t require ALTER TABLE but beware! This is too major a change to be fixed with a placeholder. We avoided the need for ALTER TABLE by packing all the definition of constraints and defaults into the CREATE TABLE statement and by importing the data in the correct dependency order. I should have chosen a beefier relational database system than SQLite, but then it is rather cute. The other problem is schemas. Flyway depends on schemas, but MariaDB and MySQL don’t have them.

We use the following placeholders:

  • ‘currentDateTime‘ – the function used to give the current date
  • ‘schemaPrefix’ – the schema prefix to use. (e.g., dbo) MariaDB is a stranger to schemas.
  • ‘CLOB‘ – the data type you use for text of indeterminate large size
  • ‘BLOB’ – the data type you use for binary data of large size
  • ‘autoIncrement’ – how you specify an identity column
  • ‘DateDatatype’ – the datatype for storing both date and time
  • ‘hexValueStart’ – how you declare hex-based binary data
  • ‘hexValueEnd’ – how you terminate hex-based binary data
  • ‘arg’ – the prefix for the variables that you use as parameters for procedures (SQL Server uses the @ prefix)
  • ‘viewStart’ – the code to start the creation of a view
  • ‘viewFinish’ – the code to finish the creation of a view
  • ‘procStart’ – the initial string for a procedure
  • ’emptyProcArgs’ – how you denote that you have no args/parameters for creating a procedure
  • ‘procBegin’ – the preliminary string for a procedure
  • ‘procEnd’ – the termination string for a procedure
  • ‘procFinish’ – the final string for a procedure

Each database system seems to have a different way of doing these things, but with these placeholders set appropriately, one can get a long way in developing a simple database.

Here are some example placeholders for MariaDB:

-placeholders.hexValueStart="decode('"
-placeholders.dateDatatype="DateTime"
-placeholders.procFinish="//     DELIMITER ;"
-placeholders.arg=""
-placeholders.viewFinish=""
-placeholders.autoIncrement="int NOT NULL auto_increment primary key"
-placeholders.procBegin="Begin"
-placeholders.BLOB="longblob"
-placeholders.emptyProcArgs="()"
-placeholders.procStart="DELIMITER //"
-placeholders.schemaPrefix=""
-placeholders.hexValueEnd="','hex')"
-placeholders.currentDateTime="CURDATE()"
-placeholders.viewStart=""
-placeholders.procEnd="end;"
-placeholders.CLOB="longtext"

Automating Flyway migrations across multiples relational database systems

It is very satisfying seeing four copies of the Pubs database being built on four entirely different relational database systems in under a minute, while one sips a cocktail or sherry. Flyway keeps individual record in the flyway_schema_history table of each database of what migrations have been applied and when. As a result, it doesn’t matter if things get out of sync with successfully applied migrations, and therefore version numbers, when it hits an error on just one of the four databases. If Flyway must do something for a particular database, it learns about it from the schema history table and, if necessary, does it. If it doesn’t, it just shrugs and passes on to the next.

If you want to play along, you can grab a copy of my PubsAgnostic GitHub project. To make this even easier, you can use my PowerShell cmdlet called Get-FilesFromRepo, which you can get from here via cut-n-paste. Having loaded the cmdlet, just run the following code to create a local copy of the project. I’ve put it in a temporary directory (see the $ProjectFolder variable below) but you can change this to wherever you want to store it. For example, you might want to create your own GitHub project and store it there.

The Scripts subfolder in this project contains two Flyway migration scripts that we’ll execute, in PowerShell, to build and then fill a copy of the classic Pubs database on four different database systems: SQL Server, PostgreSQL, MariaDB and SQLite (if you need to get up speed on the basics of running Flyway migrations on each of these systems, the links will take you to the right articles).

The work of automating these migrations is done by my Create-FlyWayParametersets cmdlet (PolyglotFlyway.ps1). It accepts an array of placeholders ($FlywayArray), one placeholder for each database system. From this, it creates parameter sets that it can then use to connect to each system in turn and run the migration scripts. It will also produce config files if you prefer that.

The placeholder array

We pass as a parameter to the Create-FlyWayParametersets cmdlet, a data structure ($FlywayArray) that holds all our placeholders. We have four different database systems with a development database on them. We therefore need four objects in our array.

Once you’ve done all this, it is just a matter of steering to the right directory and keying in the basic Flyway command and action.

Passing in this array as a parameter, in PowerShell, is only one way to do it. Another way is to specify the placeholder values for each database system in its own in Flyway config file (Flyway has an infinite appetite for these files), and then specify the path to the config file for that specific database type as a parameter to Flyway. You can also save your database details and project data as config files.

I use PowerShell to speed up development and because it helps with bookkeeping (such as ensuring that all placeholders are defined if you need to introduce some more), which I’m not good at doing.

You can, of course, keep this array as a JSON file. If you want to save or edit these as JSON, each JSON document will look like this:

I’ve put the complete file here in FlyWayJsonPlaceholders.json

Creating parameter sets from placeholders

The Create-FlyWayParametersets PowerShell cmdlet does the processing for you and produces an array of parameters for Flyway. If you just want to develop one database, you specify the name of the RDBMS in the -WhichToDo optional parameter.

Running the polyglot migrations

With all this in place, we simply call the Create-FlyWayParametersets supplying the path our project folder, and a project name sand description. It runs the same set of migrations on each of the databases and tames the output of Flyway so that one can do rapid database development work.

Summary

In this article, I’ve demonstrated the power of placeholders in Flyway. To illustrate this, I’ve created a project that builds the old Pubs database to four different Relational database systems, despite several differences in syntax. This use of placeholders extends further, of course. There are plenty of occasions when a version of a database must deal with variants: you may need slightly different databases of the same version to be compatible with different applications, different legislative frameworks or for doing performance tests.

Hopefully, this demonstration will have given you idea of how to get around the complications that often face database projects that operate strict versioning.

Tools in this post

Flyway

DevOps for the Database

Find out more