12 October 2023

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

Phil Factor

12 October 2023

Using a GitHub Tagged Release for a Flyway Migration

Why not just build the latest version of any branch of the database by pulling the scripts from the latest tagged release on GitHub? While it is easy to get the files via the GitHub site, it gets tedious to do so repeatedly, via the GUI. It is, however, possible to automate this via the Rest API, using a script. If you are using PowerShell, I've done it for you.

Guest post

If you can automate the task of downloading the latest branch version, or a specific release, from GitHub repository, then you can maintain the source just one in this single location and just grab the files whenever you need to do a build. This is a good way of preparing a database release candidate, regardless of how you develop your database.

The Flyway migration system

There are plenty of ways to create a database, but the most common are to use either a build script or a chain of migration scripts. You can use an object-level directory too, if you have SQL Compare for your RDBMS, or if have a manifest that tells you the correct order of building the objects that make up the database (see, for example, Recording What’s Changed when Running Oracle Migrations with Flyway).

To provision a new database, or update an existing database to the new version, using Flyway, all you would need to do, on your local system, is execute the necessary chain of migration scripts, in the correct order. However, normally, more than this is required. For most database build or migration systems, including Flyway, you will need four different sources of information:

Personal security credentials– to access the server as a user with sufficient authority to build the database, which may require details such as UserID and Password, Oracle wallet, GitHub credentials, or sometimes even a security key that is based in the workstation. These are never archived and are usually kept securely.
Connection information – to make a connection with the Server, so you can log in with your credentials and access the RDBMS. Connection information is usually stored separately, as a team resource.
Project information – such as values for placeholders, preferences, database configurations. In the case of Flyway, this is stored as a configuration (.conf) file.
The code – to build or migrate the database, stored in source control.

Here we’ll concentrate on the fourth source, the code. It is the most obvious and important part of the solution. I’ve covered several techniques for maintaining secure credentials and database connection information in other articles, though my general approach, with my Teamwork framework, is to store them encrypted in a separate .conf file within the user area.

Flyway and Source control

Once Flyway has successfully used a migration file to create a new version of a database then it will prevent you from retrospectively tampering with that file, using checksums. With Flyway, every new ‘versioning’ migration file represents a new version. The previous files are unchanging, so the latest set of scripts will allow you to recreate any previous version.

A set of migration scripts, once successfully used to build a database, can only be changed, for that database, by using Flyway undo scripts to revert to an earlier version, and then redoing the subsequent migrations, or by cleaning out the entire database and redoing the entire migration run.

You might wonder, therefore, why you’d want to store a set of such immutable, unchanging, files in source control, a tool that assumes that all the components of a build are subject to change, which it will then track for you. Basically, it is because not every Flyway file is immutable! Whereas the forward migration files (prefix V) are immutable for a particular database once used, nothing much else is. The undo (prefix U) files aren’t immutable, for example: if they are incorrect, we need to change them. There are also script-based migration files and callback script that may need to be altered, after first use, such as to correct a bug.

Also, we can, and do, store a build script for each version of the database, an object-level directory, perhaps a JSON object model and anything else we need for development and debugging work. You might assume that something like an object model is unlikely to change for any version and so would be immutable, but things occasionally happen to prove you wrong. What, for example, if a particular object type (such as, maybe, Oracle’s ATTRIBUTE DIMENSION, HIERARCHY or ANALYTIC VIEW) is missing from an object model, and needs ‘back-filling’?

Only the V files will define a database version, and so are the obvious place for holding the database version, but this only produces the identical database version if everything else that is associated with a database version is stored with it. The contents of ‘Repeatable’ (R) migration files, for example, may or may not change between versions. They aren’t even executed in a migration run unless you change them. These R files must be present for every migration run, whether they’ve changed or not.

The larger the team, the more complications can ensue from making any such retrospective change, so they can only be done safely when working in an isolated branch. Because of the possibility of a retrospective change to any file that isn’t a V file, it is usually best to use the latest copy of the collection of migration scripts to build the current version of the database for the branch, or whatever previous version of it that you want. In addition, you can migrate an existing version to a later release. If you have Flyway Teams and the relevant undo files, you can migrate to a previous release.

Flyway and Git

With a conventional build-based database development, a build becomes frozen in time, as a release, at the point that we attach a version number to it, as a tag, in the source control system. In Git, a tagged release becomes the database version.

The tags that are used by Git are just markers that can be placed at any specific commit in a Git repository’s history, and they are often used to identify important points in the development timeline, such as releases or milestones. Once a tag has been created, it can be used to refer to a specific point in the project’s history, making it easy to share and collaborate on a particular version of the code.

This feature allows developers to create a release by associating a tag with a particular set of commits. It is particularly useful because it makes it easy to add additional information such as release notes, binary files, and other documentation. Developers can this use feature to track changes between versions and to download the specific version they need, including source files, documentation, and any support materials.

Using GitHub Tags with Flyway

A GitHub-tagged release is a specific version of any GitHub project that has been marked with a tag in the Git repository. With a conventional build, your GitHub tag would correspond to the version that you wish to build, whereas, with Flyway you can build any version covered by the migration chain or migrate from an existing version to it. Basically, you’ll almost always want the latest version of the migration chain.

This means, for example, that even if you only wish to build V1.2, you’d probably want to take the set of migration files from the V1.6 tagged release. This is because each new version of a Flyway project is likely to contain other bits of information that, while they aren’t needed for the actual build of that version, are important for the development process. As we’ve discussed, other types of files, such as undo or repeatable, can affect the code.

The way that you use source control is a team decision, and depends on how you’re structuring your Flyway projects, how you’re managing branches, in the first place. I like to download the latest tagged release of a branch of a Flyway project as a set of version files that use the ‘V’ prefix, and undo files (‘U’ prefix) for the migration, and, in a separate directory, the callback script files. I use this as the location for Flyway. I then use the version tag to retrieve the rest of the ephemera, which would include the version build script. This is what I’ll demonstrate, shortly, in my Get-TaggedGitRelease script.

GitHub uses a semantic version number whereas Flyway uses a four-number version. By representing the Flyway Version within the semantic version, it would allow us to update the contents of a version, and re-tag it, while maintaining the Flyway version. We could, in effect, get the latest tag marked with a particular Flyway version, for example, 1.1.3.

This whole system only works because I never cheat the determinism of the system by running repeatable (R) migrations, or by creating migrations, SQL or scripted, that do different things depending on a placeholder value. There is also the problem of callbacks that can affect the database metadata version and might be present at some versions but not others. All I can say is that Flyway was designed to keep things simple as long as you obey the spirit of the rules. If you game the system, all bets are off.

Using GitHub’s Rest API to fetch GitHub tags

GitHub uses the standard REST API for its application in order to allow different software systems to easily communicate with GitHub over the internet. With this API, you can get the contents of a tagged release by name, or you can simply get the latest release. You can also get a list of the existing tags and sort them to choose the tag you need, and then select the contents of that tagged release.

I’ve written a PowerShell cmdlet, Get-TaggedGitRelease, that sorts it all out for you. To use this, you first need a GitHub ‘Personal access token’ to allow you to use the REST API. This authenticates you as a GitHub user, allowing you to access public-access repos. To get it:

Sign into your GitHub account
Click on your profile icon on the top right corner
Select Settings from the dropdown menu
Click on Developer settings from the left menu
Select Personal access tokens from the left menu.

With this token, the PowerShell process can make authenticated requests to the GitHub REST API on behalf of your GitHub account. For public projects, the token’s permissions should be set according to the level of access you need. The first time you use the cmdlet, or when you change it, you need to provide your ‘Personal access token’ as a parameter. The Cmdlet then stores it in your user home folder. If you don’t provide one, the cmdlet uses the last one you used. If you do, it overwrites any stored one.

Build a named version from a tagged release

Before I reveal the code for the Get-TaggedGitRelease cmdlet, let’s see an example of how we’d use it. I’ve created a sample project called GithubDemo so you can test out the API. In this simple example, we’ll just set the Flyway credentials using environment variables:

1 2	$Env:FLYWAY_USER = 'Desperate_Dan' $env:FLYWAY_PASSWORD='MyInpenetrablePassword'

And now we can use the cmdlet to download a GitHub-tagged release (V1.2) and then build that version of the database. It gets the tagged release into to a local FlywayGitHubDemo directory (by default within the user area) and then uses it to migrate a Pubs database to the tagged version.

# We fetch the Zipball of the tagged release. quoting the owner and repo. (default is 'main' branch)

# This assumes that your Flyway userID and password are already set

Get-TaggedGitRelease -repopath 'Phil-Factor/FlywayGitHubDemo' -tag 'V1.2'

cd "$env:UserProfile\FlywayGitHubDemo" # this is where it gets placed by default

$params = @(

'-url=jdbc:sqlserver://<my Server>;databaseName=<MyDatabase>;encrypt=true;trustServerCertificate=true;',

'-schemas=dbo,classic,people,accounting',

'-placeholders.projectName=Pubs',

'-placeholders.projectDescription=A sample team-based Flyway project',

'-locations=filesystem:./migrations/sql,filesystem:./migrations/Callbacks',

# by keeping callbacks and version files separate you can run migrations

#without callbacks

'-placeholders.branch=main')

flyway $params migrate

You’ll notice, if you use the sample project on GitHub, that you also get downloaded a ‘state’ directory that provides you with the object model, object level source, build script, manifest, reports of the changes in the version and so on. These are generated by Teamwork as long as you are using one of the more common flavors of relational database. This ‘state’ directory is relevant to the version of the project you’ve specified. You can, with Flyway, build to any other version of the database covered by the migration run.

That’s all there is to it, in its simplest form, although obviously we can do more. To switch to a branch is fine. You just add the ‘commit-ish’ to the -tag parameter, but you need to change the Server and database to the one assigned to the branch. This may involve a change to the database credentials too.

The Get-TaggedGitRelease cmdlet

Finally, here is a PowerShell routine, called Get-TaggedGitRelease, that gets a tagged release of a Flyway project from GitHub, via the REST API.

In its simplest form, you just need to specify the name of the repository and the name of the version you want, which most of the time will be the latest version:

1	Get-TaggedGitRelease -repopath 'Phil-Factor/FlywayGitHubDemo' -tag 'latest'

Since we didn’t specify, it will, by default, put the retrieved files in a directory in your user area, with the same name as the project, and retrieve all filetypes (but you can configure these as you require using the -TargetFolder and -Filespec parameters). The first time you use it, you’ll also need to specify in the -credentials parameter your personal access token.

If instead you want a specified version, or the version before the latest version, just change the -tag:

1 2	Get-TaggedGitRelease -repopath 'Phil-Factor/FlywayGitHubDemo' -tag 'V1.1' -Filespec '' Get-TaggedGitRelease -repopath 'Phil-Factor/FlywayGitHubDemo' -tag 'penultimate' -Filespec ''

The header of the code gives more details of usage:

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

.SYNOPSIS

Gets the files of a git release, either the latest one or, if you specify the tag, the release that has that tag.

.DESCRIPTION

This is a, hopefully, reliable way of getting the latest release from GitHub, or a specific release. There are several examples on the internet but I couldn't get any to work. Git changes the protocol, but if you can get the correct path of the zip-ball of the files, you have a better chance.

.PARAMETER RepoPath

The name of the repository. e.g. Phil-Factor/FlywayGitHub

.PARAMETER credentials

The github 'Personal access token' that is provided by GitHub. You must provide this the first time that you use the cmdlet. After that it remembers it

.PARAMETER tag

This is the name of the tag. This will include the Branch, but if no branch is added, it assumes main.

.PARAMETER TargetFolder

This is the path to the destination where the code is saved. If you provide nothing then it uses the project name as a directory in your user area

.PARAMETER FileSpec

The list of types of file you want to have (e.g. *.sql,*.bat). I'd leave this at its default (*)!

.EXAMPLE

PS C:\>

Get-TaggedGitRelease -repopath 'Phil-Factor/FlywayGitHubDemo' -tag 'V1.1' -Filespec '*'

Get-TaggedGitRelease -repopath 'Phil-Factor/FlywayGitHubDemo' -tag 'latest' -Filespec '*'

Get-TaggedGitRelease -repopath 'Phil-Factor/FlywayGitHubDemo' -tag 'penultimate' -Filespec '*'

function Get-TaggedGitRelease

{

[CmdletBinding()]

param

(

[Parameter(Mandatory = $true,

ValueFromPipeline = $true,

Position = 1)]

[string]$RepoPath,

[Parameter(Mandatory = $false,

Position = 2)]

[string]$credentials = $null,

[Parameter(Mandatory = $false,

Position = 3)]

[string]$tag = $null,

[Parameter(Mandatory = $false,

Position = 4)]

[string]$TargetFolder = $null,

[Parameter(Mandatory = $false,

Position = 5)]

[String]$FileSpec = '*'

)

$RepoName = $repoPath -split '[\\/]' # get the owner and repo

$owner = $RepoName[0]; $repository = $RepoName[1]; $CredentialFile = $RepoName -join '_'

if ([string]::IsNullOrEmpty($TargetFolder))

{

$TargetFolder = "$env:UserProfile\$repository"

}

#now fetch the credentials from the user area

if (($owner -eq $null) -or ($repository -eq $null))

{

Write-error "we need both the owner and repository. e.g. Genghis/Kahn"

}

$CredentialLocation = "$env:UserProfile\$CredentialFile.txt"

if (!([string]::IsNullOrEmpty($credentials)))

{

$credentials > "$CredentialLocation" #assume it is a first time

}

else

{

#fetch existing credentials

if (Test-Path "$($CredentialLocation)")

{ $credentials = Get-Content $CredentialLocation }

else

{ Write-Error "Could not find a credential. GitHub needs a credential to authorise this" }

}

$ZipBallFolder = "$env:TMP\$repository\"

if ([string]::IsNullOrEmpty($TargetFolder)) #create the target folder if necessary

{ $TargetFolder = "$env:UserProfile\$repo\scripts" }

$WorkFolder = "$env:UserProfile\Work$(Get-Random -Minimum 1000 -Maximum 9999)"

#create the authentication header

$headers = New-Object "System.Collections.Generic.Dictionary[[String], [String]]"

$headers.Add("Authorization", "token $credentials")

#Create the basic API string for the project

$releases = "https://api.github.com/repos/$repopath/releases"

if ([string]::IsNullOrEmpty($tag)) { $Tag = 'latest' }

<# You might want the latest version of, say, 1.1.2, if someone has made changes then

they would increment the pre-release. The semantic version might be changed from

1.1.1-alpha to 1.1.2-beta. #>

if ($tag -eq 'penultimate')

{

$TheInformation = Invoke-WebRequest $releases -Headers $headers

$ReleaseList = $TheInformation.content | convertFrom-json

$TheLatest = $ReleaseList.GetEnumerator() | foreach{

$Badverion = $false; #We need to be able to sort this - assume the best

if ($_.tag_name -cmatch '\D*(\d([\d\.]){1,40})')

{

$Version = $matches[1]

}

else { $Badverion = $true; }

if (!($Badverion))

{

try { $Version = [version]$Version }

catch { $Badverion = $true }

}

if ($Badversion) { write-error "sorry but you must use numeric versions to get the latest" }

[pscustomobject]@{

'Tag' = $_.tag_name;

'Version' = $Version;

'Location' = $_.zipball_url

}

} | Sort-Object -Property version -Descending | select -First 2

if ($TheLatest.count -ne 2) { write-error "sorry but there is no penultimate tag" }

$Tag = $TheLatest[1].Tag;

$location = $TheLatest[1].Location;

}

else # we use the tag. 'latest' for latest

{

if ($tag -eq 'latest')

{ $URL = "$releases/latest" }

else

{ $URL = "$releases/tags/$tag" }

$TheInformation = Invoke-WebRequest $URL -Headers $headers

$TheReleaseInformation = $TheInformation.Content | convertfrom-json

$location = $TheReleaseInformation.zipball_url

$Tag = $TheReleaseInformation.Tag_Name

Write-verbose "Now we have tag $Tag from $URL"

}

#we now have the tag and the location

if (($location -eq $null) -or ($Tag -eq $null))

{ write_error "could not find that tagged release" }

else

{

Write-verbose "Downloading release $Tag to $ZipBallFolder"

# $headers.Add("Accept", "application/octet-stream")

# make sure that the folder is there

if (-not (Test-Path "$($ZipBallFolder)"))

{ $null = New-Item -ItemType Directory -Path "$($ZipBallFolder)" -Force }

#now get the zip file

if (-not (Test-Path "$($ZipBallFolder)"))

{ $null = New-Item -ItemType Directory -Path "$($ZipBallFolder)" -Force }

#now get the zip file

Invoke-WebRequest -Uri $location -Headers $headers -OutFile "$($ZipBallFolder)$Tag.zip"

Write-verbose "Extracting release files from $($ZipBallFolder)$Tag.zip to $WorkFolder"

#now unzip the Zipball

Expand-Archive "$($ZipBallFolder)$Tag.zip" -DestinationPath $WorkFolder -Force

$sourceDirectory = (dir $WorkFolder -Directory).name

#now remove just the directories in the target that we are copying

if ([string]::IsNullOrEmpty($sourceDirectory) -or [string]::IsNullOrEmpty($TargetFolder))

{ write-error "cannot delete existing content of your folder" }

else

{

$DirectoriesWeCopy = dir "$WorkFolder\$sourceDirectory\$filespec" -Directory | foreach{ $_.name }

$DirectoriesWeCopy | foreach{ Remove-Item "$TargetFolder\$_" -ErrorAction SilentlyContinue -recurse -Force }

}

Write-verbose "copying items from $WorkFolder\$sourceDirectory\$filespec to $TargetFolder"

copy-item "$WorkFolder\$sourceDirectory\$filespec" -Destination "$TargetFolder" -Recurse -Force

Remove-Item $WorkFolder -Recurse -force

Remove-Item "$($ZipBallFolder)$Tag.zip" -Force

}

Other joys of the REST API

Although raw GIT commands are fine with PowerShell, the REST API has particular value when you need to transmit information. Most data is returned as JSON documents and GitHub is almost obsessive in the way it packs in this information. I’m fine with extending the information-gathering about a project, such as examining the details of the current branches, the uncommitted files and so on. My own feeling is that the committing of code is too manual a process, what with inspecting the changes, testing, and sifting out the mistakes.

Conclusion

This model is a useful way and neat of working, particularly of working will suit a branch of a database where all code is committed before the database is built. How, though, would one test code before that stage? By the time I commit code to project source control, it’s been well tested and thought through. I’ve done timings, checked alternatives and submitted the work to peer-reviews. It could be that I’m a dinosaur in an age of tame creatures, but when on an isolated branch of a database, I need elbow room to work quickly. If you were to restrict this sort of system to just the development branch, you’d have to have two different systems in place. The important point with Flyway is that you get a lot of freedom in the way that you work. If your team prefers to work this way, then Flyway is fine with it!

Tools in this post

Redgate Flyway

Bring stability and speed to database deployments

Find out more

Redgate Flyway

Bring stability and speed to database deployments

Find out more

Redgate Test Data Manager

Redgate Flyway

Redgate Monitor

Overview

Protect

Automate

Monitor

Product articles Redgate Flyway Database Builds and Deployments
Using a GitHub Tagged Release for a…

Guest post

Using a GitHub Tagged Release for a Flyway Migration

Guest post

The Flyway migration system

Flyway and Source control

Flyway and Git

Using GitHub Tags with Flyway

Using GitHub’s Rest API to fetch GitHub tags

Build a named version from a tagged release

The Get-TaggedGitRelease cmdlet

Other joys of the REST API

Conclusion

Tools in this post

Redgate Flyway

Redgate Flyway

Product Articles

University

Events

Forums

Community

Simple Talk

Redgate Test Data Manager

Redgate Flyway

Redgate Monitor

Overview

Protect

Automate

Monitor

Product articles Redgate Flyway Database Builds and Deployments Using a GitHub Tagged Release for a…

Guest post

Guest post

The Flyway migration system

Flyway and Source control

Flyway and Git

Using GitHub Tags with Flyway

Using GitHub’s Rest API to fetch GitHub tags

Build a named version from a tagged release

The Get-TaggedGitRelease cmdlet

Other joys of the REST API

Conclusion

Tools in this post

Redgate Flyway

Redgate Flyway

You may also like

Redgate Hub

Product Articles

University

Events

Forums

Community

Simple Talk

Product articles Redgate Flyway Database Builds and Deployments
Using a GitHub Tagged Release for a…