Product articles
Flyway
A Simple Example of Flyway Development…

A Simple Example of Flyway Development using GitHub Branching

This article demonstrates one way to do branch-based database development with Flyway, using GitHub to manage the branches and Flyway configuration files to allow Flyway to switch smoothly between databases, when we move between branches in GitHub.

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

We’ll do a Flyway development simply, using GitHub to manage the branching. No PowerShell: just the Flyway command line.

In my previous article on branching and merging for Flyway projects, I explained how to manage it all independently of any particular version control system. We have one database per branch. All developers contributing to a branch use the same database. Each branch within the Flyway project is essentially a “sub-project”, with its own Flyway configuration file, identifying the database for that branch, and Migrations folder. I prefer to do it this way because it makes merging a lot easier.

In this article, I’ll follow a similar scheme, with one shared database per branch, but this time show how to manage the branching in GitHub. To make this work, we need a simple way of letting Flyway know the current project, branch and database. We’ll need to arrange that the correct connection is specified to Flyway whenever we switch between branches in GitHub. Flyway will need to know the server, database name, database user and connection details so that the person logged into the workstation will be correctly identified within the database that is used for the branch.

Of course, we still have the task of merging. In my experience, no version control system can help much when merging a branch of a relational database development. My previous article (linked above) and Branching and Merging in Database Development using Flyway describe some of the techniques that can make merges less error-prone, such as tracking object change history to investigate differences between branches and using clones for branch database provisioning and pre-merge testing.

Managing branch switching and database connection details

Flyway, as a command-line app, allows the user to put credentials and connection strings into any of three standard flyway.conf files. It also provides a much better, and more flexible, way of providing these details, either as environment variables or command-line parameters.

The project-based config file will have project-based information such as placeholders so must go into source control and be shared. However, it isn’t such a good idea to have any connection information in GitHub, especially where the user identifier can be baked into the connection string of the database. As well as the obvious security problem, how would each co-worker be able to separately manage their database UserIDs and passwords? They must avoid sharing either.

We can’t postpone dealing with this problem of storing connection information, even with this simple demonstration, because the combination of user, and server represents an open goal for an intrusion. The server and database are team-level configuration information items, of course, and need to be archived locally, but each co-worker on a branch will need a separate database user, a connection string and a password. They may even want, in addition to the branch database that represents the current state of the work, at least one separate individual copy of the database for doing test work.

One option is to supply the config key/value for the URL, user, and password as environment variables, or command-line strings. However, here we’ll opt to keep these ‘secrets’ in one of or more extra flyway.conf files in the user area of the workstation, ideally with the password encrypted. The problem that we must get around is that the flyway.conf file in the user area allows only one user to be specified, whereas it is likely that we’d want, or be required to use, a different user and password for each server and a different database for every active branch of every Flyway project.

All this boils down to a requirement to store multiple flyway.conf files in the user area of each developer’s workstation, at least one per GitHub branch for each project. We then need a mechanism that allows GitHub to indicate to Flyway the current project and ‘active’ branch, so that when we run Flyway it can locate the required config file, containing the connection details of the branch database for that project.

Aha, you think, the solution is simple. Within the project-level flyway.conf file, stored in GitHub, we use the configFiles parameter to supply the path to the user-level config file whose name includes the project and branch name. As it is considered by Flyway to be a relative reference to the current working directory, it will, you’d think, happily read configuration such as the URL and user from one file, or a list of files, from this user profile folder. We could then simply alter the project flyway.conf file in the current working directory, within each GitHub branch, so that Flyway uses the correct the server, database, user and password for that branch.

For example, the name of the user-level config file for the Develop branch of the Pubs Flyway project would be Pubs_Develop. The configFiles entry in the flyway.conf file in the current working directory of the Develop branch in GitHub would look like this:

This solution is great in theory, but reality gets in the way, because Flyway doesn’t currently allow the configFiles parameter in a config file. This is, I guess, because of a fear of the danger of endless recursion through config files, but this is a problem that can be avoided in code.

To make the idea work, we must modify this solution, either to specify the extra config file as a parameter at the command line…

…or, with less effort because it is permanent for the entire session of the console, we would set the environment variable just at the start of the session:

The disadvantage of this approach, compared to being able to use the configFiles parameter in the conf file, is that we must manually ‘reset’ the environment variable for the session, each time we switch branches. I do this by storing the above command for each branch, within a batch file in the working directory of that branch. We simply execute the batch file each time we switch branches.

The only drawback to this whole system is that, if a database is changed to another server or database name, the change would have to be made to this file by the user, but this doesn’t happen frequently.

These extra config files can be used in my Flyway Teamwork framework as well. If Flyway Teamwork detects an extra config file in the user profile folder for the current project and branch, it will automatically read it in, get the values in it, and add it to the Environment variable for Flyway to ingest (I’ll demo this in another article).

Let’s do it!

I’ve provided a demo project on GitHub at FlywayPubsDemo. This project is designed for trying out GitHub-based branching with Flyway projects in SQL Server. It is based on the old Pubs database, published by Sybase as their demo database. It was designed to be a mock database for book distributers. I’ve added a few things to it, in various stages of development.

First, simply clone the project to your local system. Next, in your user profile folder on the workstation, you prepare a set of config files, each one designed to allow the current user of the workstation to access the server/database for a particular branch of the GitHub project. Our project is Pubs, and we currently have Main and Develop branches, plus a Search feature sub-branch off Develop:

additional user-level flyway configuration files

Within the conf file for each branch, we need to specify the server, database, database username and password. This example is for SQL Server:

Because these files are in your user profile, there is a reasonable measure of OS-level security. I always advise encryption too on Windows workstations, because of Windows vulnerability to hard-drive attacks, but this requires scripting (my Flyway Teamwork PowerShell framework has a built-in task called $FetchAnyRequiredPasswords that does this for you).

Whenever we change branch, and before running Flyway, we need to run the batch file that sets the environment variable for that branch, to inform Flyway where to find the user-specific flyway.conf file. In the case of windows command-line, we simply call the file:

Now, with the project in place, and the credentials in place in the user profile folder, we can now migrate all three branch databases, using the code in the github project.

Deploying the main branch

We start with the database for main.

The main branch in GitHub

We run the batch file then execute Flyway info:

(pathToTheRepository)>call runme.bat
(pathToTheRepository)>set FLYWAY_CONFIG_FILES=C:\Users\phil\PubsMain.conf
(pathToTheRepository)>flyway info
Schema version: << Empty Schema >>
+-----------+---------+----------------------------+------+--------------+---------+----------+
| Category  | Version | Description                | Type | Installed On | State   | Undoable |
+-----------+---------+----------------------------+------+--------------+---------+----------+
| Versioned | 1.1     | FirstRelease               | SQL  |              | Pending | Yes      |
| Versioned | 1.2     | SecondRelease1-1-3to1-1-11 | SQL  |              | Pending | No       |
+-----------+---------+----------------------------+------+--------------+---------+----------+

OK, we can see that main branch has just two migration files. The first release builds the classic Pubs database, and the second release consolidated many migrations into a single script, including some of the development work from the Develop branch, up to version 11. This is the release that we then deploy.

(pathToTheRepository)>flyway migrate
Successfully validated 3 migrations (execution time 00:00.635s)
Creating schema [classic] ...
Creating schema [ people ] ...
Creating Schema History table [PubsMain].[dbo].[flyway_schema_history] ...
 ….blah! blah!… (many lines deleted)
Successfully applied 2 migrations to schema [dbo], now at version v1.2 (execution time 00:28.337s)
Executing SQL callback: afterMigrate - Add Version EP
WARNING: DB: Recording the database's version number - 1.2 (SQL State: S0001 - Error Code: 0)
Executing SQL callback: afterMigrate - ApplyTableDescriptions
WARNING: DB: Adding the descriptions for all tables and columns (SQL State: S0001 - Error Code: 0)

(pathToTheRepository)>

This all worked and the target database for the main branch is now at v1.2.

Building the Develop branch

In GitHub, we switch to the Develop branch:

The Develop branch in GitHub

With a change of branch, we need to reset the environment variable:

(pathToTheRepository)>call runme.bat
(pathToTheRepository)>set FLYWAY_CONFIG_FILES=C:\Users\andre\PubsDevelop.conf

Now we can run Flyway info:

(pathToTheRepository)>flyway info
Schema version: << Empty Schema >>
+-----------+---------+-----------------------------+------+--------------+---------+----------+
| Category  | Version | Description                 | Type | Installed On | State   | Undoable |
+-----------+---------+-----------------------------+------+--------------+---------+----------+
| Versioned | 1.1.1   | Initial Build               | SQL  |              | Pending | Yes      |
| Versioned | 1.1.2   | Pubs Original Data          | SQL  |              | Pending | Yes      |
| Versioned | 1.1.3   | UseNVarcharetc              | SQL  |              | Pending | Yes      |
| Versioned | 1.1.4   | RenameConstraintsAdd tables | SQL  |              | Pending | Yes      |
| Versioned | 1.1.5   | Add New Data                | SQL  |              | Pending | Yes      |
| Versioned | 1.1.6   | Add Tags                    | SQL  |              | Pending | Yes      |
| Versioned | 1.1.7   | Add Indexes                 | SQL  |              | Pending | Yes      |
| Versioned | 1.1.8   | AddEditions                 | SQL  |              | Pending | Yes      |
| Versioned | 1.1.9   | AddconditionalVersion       | SQL  |              | Ignored | Yes      |
| Versioned | 1.1.10  | AddAddressesPhonesEtc       | SQL  |              | Pending | Yes      |
| Versioned | 1.1.11  | AddProcedureWithTest        | SQL  |              | Pending | Yes      |
| Versioned | 1.1.12  | AddTestData                 | SQL  |              | Pending | Yes      |
| Versioned | 1.1.13  | Accounting                  | SQL  |              | Pending | Yes      |
| Versioned | 1.1.14  | AccountingData              | SQL  |              | Pending | Yes      |
+-----------+---------+-----------------------------+------+--------------+---------+----------+

You can see the individual migrations, up to v1.1.11, that were consolidated into the first two releases in main. You can also see additional migrations to load test data then build and populate an accounting package.

Let’s now we migrate the empty database for the Develop branch up to V1.1.14 (I took out some of Flyway’s verbosity here):

(pathToTheRepository)>flyway migrate
Successfully validated 28 migrations (execution time 00:01.428s)
Creating schema [classic] ...
Creating schema [ people ] ...
Creating schema [accounting] ...
Creating Schema History table [PubsDevelop].[dbo].[flyway_schema_history] ...
Migrating schema [dbo] to version "1.1.1 - Initial Build"
Migrating schema [dbo] to version "1.1.2 - Pubs Original Data"
Migrating schema [dbo] to version "1.1.3 - UseNVarcharetc"
Migrating schema [dbo] to version "1.1.4 - RenameConstraintsAdd tables"
Migrating schema [dbo] to version "1.1.5 - Add New Data"
Migrating schema [dbo] to version "1.1.6 - Add Tags"
Migrating schema [dbo] to version "1.1.7 - Add Indexes"
Migrating schema [dbo] to version "1.1.8 - AddEditions"
Migrating schema [dbo] to version "1.1.10 - AddAddressesPhonesEtc"
Migrating schema [dbo] to version "1.1.11 - AddProcedureWithTest"
Migrating schema [dbo] to version "1.1.12 - AddTestData"
Migrating schema [dbo] to version "1.1.13 - Accounting"
Migrating schema [dbo] to version "1.1.14 - AccountingData"
Successfully applied 13 migrations to schema [dbo], now at version v1.1.14 (execution time 02:36.685s)
Executing SQL callback: afterMigrate - Add Version EP
Recording the database's version number - 1.1.14 
Executing SQL callback: afterMigrate - ApplyTableDescriptions
Adding the descriptions for all tables and columns

Working on a feature branch

So finally, yes you guessed, we’ll make sure that the Search feature branch is up and running:

A feature branch in GitHub

Again, we reset the environment variable so that it points to the user-level conf file for the Search branch:

(pathToTheRepository)>call runme.bat
(pathToTheRepository)>set FLYWAY_CONFIG_FILES=C:\Users\phil\PubsSearch.conf

Now we run the info command for the Search branch database:

(pathToTheRepository)>flyway info
…Blah! Blah! …
Schema version: << Empty Schema >>
+-----------+----------+----------------------+------+--------------+---------+----------+
| Category  | Version  | Description          | Type | Installed On | State   | Undoable |
+-----------+----------+----------------------+------+--------------+---------+----------+
| Versioned | 1.1.12   | toSearchBranch       | SQL  |              | Pending | No       |
| Versioned | 1.1.12.1 | Search Functionality | SQL  |              | Pending | Yes      |
+-----------+----------+----------------------+------+--------------+---------+----------+

At this point, you’ll see that the versions conflict with what is in Develop. This is because the Search branch was created before the Accounting branch, but the latter was then merged in first, with the version numbers that were appropriate at the time. The joy of branching is that you can park features until they are wanted for release.

The version numbers can be specific to the branch but when merging is done, the resulting migration is condensed into a migration that accounts for the current state of the Develop branch and is appropriately ‘versioned’ for the that branch. It has to be this way, because you may need to make other changes to the Develop branch of the database to accommodate the feature within the feature branch.

Now we build the Search branch database:

(pathToTheRepository)>flyway migrate
Successfully validated 3 migrations (execution time 00:00.544s)
Creating schema [classic] ...
Creating schema [ people ] ...
Creating schema [accounting] ...
Creating Schema History table [PubsSearch].[dbo].[flyway_schema_history] ...
Current version of schema [dbo]: null
Migrating schema [dbo] to version "1.1.12 - toSearchBranch"
Migrating schema [dbo] to version "1.1.12.1 - Search Functionality"
Successfully applied 2 migrations to schema [dbo], now at version v1.1.12.1 (execution time 03:42.993s)

Now we see of all is done:

(pathToTheRepository)>flyway info
Schema version: 1.1.12.1
+-----------+----------+------------------------------+--------+---------+----------+
| Category  | Version  | Description                  | Type   | State   | Undoable |
+-----------+----------+------------------------------+--------+---------+----------+
|           |          | << Flyway Schema Creation >> | SCHEMA | Success |          |
| Versioned | 1.1.12   | toSearchBranch               | SQL    | Success | No       |
| Versioned | 1.1.12.1 | Search Functionality         | SQL    | Success | Yes      |
+-----------+----------+------------------------------+--------+---------+----------+

Conclusions

This article demonstrates the use of Flyway, together with GitHub to do branching. It is, of course, trivial, except for the problem of allowing us to switch databases in sync with the code for the branches. I consider Github-based branching to be less helpful at the point of merge. Once the database reaches a certain size, merging a SQL branch is, in my experience, a task that requires simultaneous access to the databases of both branches.

Unlike for an application, you cannot generally switch between databases just by using Git (SQLite is an exception because it is purely file-based). Relational databases are permanent resources and so need to be treated somewhat differently. They also have the complication of requiring connection information, and credentials. Each branch requires its own database.

Do I recommend using Git in this way? I’ve learned to be neutral on the specific brand of source control, or the best method of using it. It’s not ‘my department’. It is up to the individual development team to use a methodology that suits the development and the team, and here Flyway Community and Flyway Teams does well because it doesn’t insist on a particular workflow but aims to support the way you work.

 

Tools in this post

Flyway

DevOps for the Database

Find out more