Product articles Flyway Database Branching and Merging
Using a GitHub Tagged Release for a…

Using a GitHub Tagged Release for a Flyway Migration

Why not just build the latest version of any branch of the database by pulling the scripts from the latest tagged release on GitHub? While it is easy to get the files via the GitHub site, it gets tedious to do so repeatedly, via the GUI. It is, however, possible to automate this via the Rest API, using a script. If you are using PowerShell, I've done it for you.

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

If you can automate the task of downloading the latest branch version, or a specific release, from GitHub repository, then you can maintain the source just one in this single location and just grab the files whenever you need to do a build. This is a good way of preparing a database release candidate, regardless of how you develop your database.

The Flyway migration system

There are plenty of ways to create a database, but the most common are to use either a build script or a chain of migration scripts. You can use an object-level directory too, if you have SQL Compare for your RDBMS, or if have a manifest that tells you the correct order of building the objects that make up the database (see, for example, Recording What’s Changed when Running Oracle Migrations with Flyway).

To provision a new database, or update an existing database to the new version, using Flyway, all you would need to do, on your local system, is execute the necessary chain of migration scripts, in the correct order. However, normally, more than this is required. For most database build or migration systems, including Flyway, you will need four different sources of information:

  1. Personal security credentials– to access the server as a user with sufficient authority to build the database, which may require details such as UserID and Password, Oracle wallet, GitHub credentials, or sometimes even a security key that is based in the workstation. These are never archived and are usually kept securely.
  2. Connection information – to make a connection with the Server, so you can log in with your credentials and access the RDBMS. Connection information is usually stored separately, as a team resource.
  3. Project information – such as values for placeholders, preferences, database configurations. In the case of Flyway, this is stored as a configuration (.conf) file.
  4. The code – to build or migrate the database, stored in source control.

Here we’ll concentrate on the fourth source, the code. It is the most obvious and important part of the solution. I’ve covered several techniques for maintaining secure credentials and database connection information in other articles, though my general approach, with my Teamwork framework, is to store them encrypted in a separate .conf file within the user area.

Flyway and Source control

Once Flyway has successfully used a migration file to create a new version of a database then it will prevent you from retrospectively tampering with that file, using checksums. With Flyway, every new ‘versioning’ migration file represents a new version. The previous files are unchanging, so the latest set of scripts will allow you to recreate any previous version.

A set of migration scripts, once successfully used to build a database, can only be changed, for that database, by using Flyway undo scripts to revert to an earlier version, and then redoing the subsequent migrations, or by cleaning out the entire database and redoing the entire migration run.

You might wonder, therefore, why you’d want to store a set of such immutable, unchanging, files in source control, a tool that assumes that all the components of a build are subject to change, which it will then track for you. Basically, it is because not every Flyway file is immutable! Whereas the forward migration files (prefix V) are immutable for a particular database once used, nothing much else is. The undo (prefix U) files aren’t immutable, for example: if they are incorrect, we need to change them. There are also script-based migration files and callback script that may need to be altered, after first use, such as to correct a bug.

Also, we can, and do, store a build script for each version of the database, an object-level directory, perhaps a JSON object model and anything else we need for development and debugging work. You might assume that something like an object model is unlikely to change for any version and so would be immutable, but things occasionally happen to prove you wrong. What, for example, if a particular object type (such as, maybe, Oracle’s ATTRIBUTE DIMENSION, HIERARCHY or ANALYTIC VIEW) is missing from an object model, and needs ‘back-filling’?

Only the V files will define a database version, and so are the obvious place for holding the database version, but this only produces the identical database version if everything else that is associated with a database version is stored with it. The contents of ‘Repeatable’ (R) migration files, for example, may or may not change between versions. They aren’t even executed in a migration run unless you change them. These R files must be present for every migration run, whether they’ve changed or not.

The larger the team, the more complications can ensue from making any such retrospective change, so they can only be done safely when working in an isolated branch. Because of the possibility of a retrospective change to any file that isn’t a V file, it is usually best to use the latest copy of the collection of migration scripts to build the current version of the database for the branch, or whatever previous version of it that you want. In addition, you can migrate an existing version to a later release. If you have Flyway Teams and the relevant undo files, you can migrate to a previous release.

Flyway and Git

With a conventional build-based database development, a build becomes frozen in time, as a release, at the point that we attach a version number to it, as a tag, in the source control system. In Git, a tagged release becomes the database version.

The tags that are used by Git are just markers that can be placed at any specific commit in a Git repository’s history, and they are often used to identify important points in the development timeline, such as releases or milestones. Once a tag has been created, it can be used to refer to a specific point in the project’s history, making it easy to share and collaborate on a particular version of the code.

This feature allows developers to create a release by associating a tag with a particular set of commits. It is particularly useful because it makes it easy to add additional information such as release notes, binary files, and other documentation. Developers can this use feature to track changes between versions and to download the specific version they need, including source files, documentation, and any support materials.

Using GitHub Tags with Flyway

A GitHub-tagged release is a specific version of any GitHub project that has been marked with a tag in the Git repository. With a conventional build, your GitHub tag would correspond to the version that you wish to build, whereas, with Flyway you can build any version covered by the migration chain or migrate from an existing version to it. Basically, you’ll almost always want the latest version of the migration chain.

This means, for example, that even if you only wish to build V1.2, you’d probably want to take the set of migration files from the V1.6 tagged release. This is because each new version of a Flyway project is likely to contain other bits of information that, while they aren’t needed for the actual build of that version, are important for the development process. As we’ve discussed, other types of files, such as undo or repeatable, can affect the code.

The way that you use source control is a team decision, and depends on how you’re structuring your Flyway projects, how you’re managing branches, in the first place. I like to download the latest tagged release of a branch of a Flyway project as a set of version files that use the ‘V’ prefix, and undo files (‘U’ prefix) for the migration, and, in a separate directory, the callback script files. I use this as the location for Flyway. I then use the version tag to retrieve the rest of the ephemera, which would include the version build script. This is what I’ll demonstrate, shortly, in my Get-TaggedGitRelease script.

GitHub uses a semantic version number whereas Flyway uses a four-number version. By representing the Flyway Version within the semantic version, it would allow us to update the contents of a version, and re-tag it, while maintaining the Flyway version. We could, in effect, get the latest tag marked with a particular Flyway version, for example, 1.1.3.

This whole system only works because I never cheat the determinism of the system by running repeatable (R) migrations, or by creating migrations, SQL or scripted, that do different things depending on a placeholder value. There is also the problem of callbacks that can affect the database metadata version and might be present at some versions but not others. All I can say is that Flyway was designed to keep things simple as long as you obey the spirit of the rules. If you game the system, all bets are off.

Using GitHub’s Rest API to fetch GitHub tags

GitHub uses the standard REST API for its application in order to allow different software systems to easily communicate with GitHub over the internet. With this API, you can get the contents of a tagged release by name, or you can simply get the latest release. You can also get a list of the existing tags and sort them to choose the tag you need, and then select the contents of that tagged release.

I’ve written a PowerShell cmdlet, Get-TaggedGitRelease, that sorts it all out for you. To use this, you first need a GitHub ‘Personal access token’ to allow you to use the REST API. This authenticates you as a GitHub user, allowing you to access public-access repos. To get it:

  1. Sign into your GitHub account
  2. Click on your profile icon on the top right corner
  3. Select Settings from the dropdown menu
  4. Click on Developer settings from the left menu
  5. Select Personal access tokens from the left menu.

With this token, the PowerShell process can make authenticated requests to the GitHub REST API on behalf of your GitHub account. For public projects, the token’s permissions should be set according to the level of access you need. The first time you use the cmdlet, or when you change it, you need to provide your ‘Personal access token’ as a parameter. The Cmdlet then stores it in your user home folder. If you don’t provide one, the cmdlet uses the last one you used. If you do, it overwrites any stored one.

Build a named version from a tagged release

Before I reveal the code for the Get-TaggedGitRelease cmdlet, let’s see an example of how we’d use it. I’ve created a sample project called GithubDemo so you can test out the API. In this simple example, we’ll just set the Flyway credentials using environment variables:

And now we can use the cmdlet to download a GitHub-tagged release (V1.2) and then build that version of the database. It gets the tagged release into to a local FlywayGitHubDemo directory (by default within the user area) and then uses it to migrate a Pubs database to the tagged version.

You’ll notice, if you use the sample project on GitHub, that you also get downloaded a ‘state’ directory that provides you with the object model, object level source, build script, manifest, reports of the changes in the version and so on. These are generated by Teamwork as long as you are using one of the more common flavors of relational database. This ‘state’ directory is relevant to the version of the project you’ve specified. You can, with Flyway, build to any other version of the database covered by the migration run.

That’s all there is to it, in its simplest form, although obviously we can do more. To switch to a branch is fine. You just add the ‘commit-ish’ to the -tag parameter, but you need to change the Server and database to the one assigned to the branch. This may involve a change to the database credentials too.

The Get-TaggedGitRelease cmdlet

Finally, here is a PowerShell routine, called Get-TaggedGitRelease, that gets a tagged release of a Flyway project from GitHub, via the REST API.

In its simplest form, you just need to specify the name of the repository and the name of the version you want, which most of the time will be the latest version:

Since we didn’t specify, it will, by default, put the retrieved files in a directory in your user area, with the same name as the project, and retrieve all filetypes (but you can configure these as you require using the -TargetFolder and -Filespec parameters). The first time you use it, you’ll also need to specify in the -credentials parameter your personal access token.

If instead you want a specified version, or the version before the latest version, just change the -tag:

The header of the code gives more details of usage:

Other joys of the REST API

Although raw GIT commands are fine with PowerShell, the REST API has particular value when you need to transmit information. Most data is returned as JSON documents and GitHub is almost obsessive in the way it packs in this information. I’m fine with extending the information-gathering about a project, such as examining the details of the current branches, the uncommitted files and so on. My own feeling is that the committing of code is too manual a process, what with inspecting the changes, testing, and sifting out the mistakes.

Conclusion

This model is a useful way and neat of working, particularly of working will suit a branch of a database where all code is committed before the database is built. How, though, would one test code before that stage? By the time I commit code to project source control, it’s been well tested and thought through. I’ve done timings, checked alternatives and submitted the work to peer-reviews. It could be that I’m a dinosaur in an age of tame creatures, but when on an isolated branch of a database, I need elbow room to work quickly. If you were to restrict this sort of system to just the development branch, you’d have to have two different systems in place. The important point with Flyway is that you get a lot of freedom in the way that you work. If your team prefers to work this way, then Flyway is fine with it!

Tools in this post

Flyway

DevOps for the Database

Find out more