Building a Database with Flyway
Flyway, especially Flyway Teams edition, can be used in several different ways to accommodate a database development that was originally based on builds rather than migrations. This article explores four different ways to use Flyway to build a particular version of a database, from the ground up, using a single migration script. It should help teams select the best way to incorporate Flyway into an existing database build system, during development, while benefitting from use of Flyway's versioned migration system for deployments and releases.
Using Flyway as a database build service
Flyway has its origins in a particular development culture where the application and database had to grow together in perfect synch, like ballroom dancers. To manage the strict ‘versioning’ that this required, it evolved several interesting features that have a more general use and have since been widely adopted for Continuous Delivery (CD). For example, it provides transaction-based rollback for database changes, and a versioning system that makes it far easier to keep all copies of a database updated.
Flyway runs a series of versioned migration scripts, each script describing the changes required to move a database from its current version to the next, while preserving all existing data. When a migration completes successfully, Flyway records the new version in its schema history table. This brings to the development process a standard way of establishing the required version of a database, a history of how it got to that version, and more reliable way of bringing a copy of a database to the version you need.
However, like many tools that have evolved to meet the requests and suggestions of their users, it will adapt to many ways of developing database applications. For example, Flyway will slot in as a replacement for a traditional database build system if that’s what you currently need. It is accommodating enough to do what is essentially a ‘build’, as a migration.
There is no single method for building a database because there is no single means of making changes to a database. Whichever way we choose to do it, the important objective is to include all the required metadata (tables, views, procedures and functions), and a standard dataset that is suitable for testing.
To deliver updates to an existing database, such as the production database, the team will simply use a versioned migration script to modify the metadata to bring it to the target version, whilst leaving the existing data intact. The last time I came across a production database that had to be updated by rebuilding it (take it offline, export data, build new version, import data, bring it back online) was before the millennium. We had needed to rearrange a few tables. When we informed the business of our plan, they were shocked that we felt it necessary to take the database offline, just to move furniture around the office.
Four ways to build a database from scratch, using Flyway
We can do it using a ‘fat’ migration script, a baseline migration script, a repeatable migration script, or a script migration. In each case, Flyway provides a build performed as an all-or-nothing operation, which makes it easier to recover from an error. However, using Flyway as a build system, in these ways, can also compromise some of the checks and safeguards that it performs to maintain version-integrity. Without these safeguards, you can end up with the type of ‘version chaos’ where work gets sidelined or lost, and that increases the risk of errors in deployment. I’ve highlighted disadvantages of each method in the subsequent descriptions.
Before we take a broad view of these alternative ways of using Flyway as a ‘build system’, a word about data. Importing the data is best considered a separate, post-migration task, because data doesn’t and shouldn’t affect the version of the database. However, in the demonstrations that follow, I simply include it as part of a versioned migration, so that the database can then be tested and used.
The single fat migration
In Flyway, a build script is little more than a migration from a blank database to a particular version. When Flyway detects an empty database, with no existing flyway schema history table, it will build a database from whatever versioned files you provide. In this case, we provide just a single build source file, which should include the version number and database.
If you have an existing build script, you’ll need to alter it to assume that the database and schemas are already created (you then specify these schemas in the flyway.conf file so that Flyway will create and manage them automatically). The build script must also contain no transactions. Since the build script is simply a versioned migration, we can easily then apply subsequent ‘patches’, as migrations, to other copies of the database. Flyway will apply any of these, in version order, when you ask it to.
- Advantage: Works with every edition of Flyway. Segues naturally into a migration-based approach.
- Disadvantage: requires a Flyway Clean action on every change to the build file, meaning that no history is maintained. Previous versions must be retrieved from source control.
The broad baseline script
Flyway Teams adds Baseline scripts (a.k.a. ‘State’ scripts) that allow you to roll up a set of migrations into a single script that takes the database to a particular version from zero, i.e., from an empty database. The difference between using a baseline and using a ‘single fat migration’ is that the baseline script tells Flyway to ignore any migrations with am equal or lower version number, so it is particularly useful in a Flyway environment where you have a large number of migration files that need to be by-passed.
Baseline scripts were intended to supplement a conventional migration, when provisioning new copies of a database. However, they are close to a traditional database build script and can be used as such even when there are no migration scripts.
- Advantage: Allows a mixed usage of migration and build scripts
- Disadvantage: Flyway cannot currently verify the baseline file by checking that the baseline file builds exactly the same version of the database as the original sequence of migration files.
The insidious repeatable
Repeatable migrations have a description in the filename but no version. Flyway runs a repeatable file, as part of a migration run, every time it detects that its checksum has changed. A normal migration works because it can assume a previous state of the database, but this isn’t true if repeatables are part of a migration run because other migrations may have made changes to these objects. Each repeatable file written for a migration would need to be idempotent. It would have to check what currently exists before making changes.
Repeatables have only a niche use in a migration, for bulk data inserts and for creating (and recreating) packages and procedures, and even then there are better ways of handling this, in my opinion.
They can only really be used reliably for builds as long as, before running the migration, you add a minimal, additional migration file that updates Flyway’s record of the database version. To be useful, this can contain documentation that records what triggered the migration.
- Advantage: allows Flyway to adopt an existing build system which consists, perhaps in part, of components such as a schema or logical part of the database having its own build script
- Disadvantage: This doesn’t allow a smooth transition to a migrations-based approach. Repeatables are fine if version numbering for the database is maintained within the version control system, but not when we are relying on Flyway migrations.
The Java, Batch or PowerShell migration
With Flyway Community, it is possible to do Java-based migrations. A Java migration can easily execute a SQL build script, load data or make all sorts of other changes. With Flyway Teams, this is extended to batch files and PowerShell files. These are very versatile.
The chief difficulty with creating this type of file is that Flyway does not pass sufficient information to the script to manage a broad range of activities such as logging into a database server or reporting. This was one of the main reasons that I wrote the Flyway Teamwork framework, which gathers the information and passes it as a hashtable.
- Advantage: Allows you a build with the fewest restrictions, but the maximum DIY requirement.
- Disadvantage: You lose a major feature of Flyway, the transaction-based migration with rollback on error, because a scripted execution of a file will use a different connection.
Trying out the alternatives
Now we’ll try out the four different approaches with SQL Server, using a build script for Adventureworks and a set of sample data files. I’ve prepared these as much as I can, so you can play along or experiment.
Preparing the build script and data
First, you will need to get at least the data files for Adventureworks from Github. I used the source code in the assets section. The sql-server-samples-adventureworks\samples\databases\adventure-works\oltp-install-script directory contains a build script (instawdb.sql) that also bulk loads data from the accompanying CSV files. This script assumes that you are loading the database into a SQL Server on your workstation or laptop, which isn’t ideal.
To do a proper server install, I copied the content of this directory to a suitable directory on the Windows Server that had SQL Server installed. This SQL Server instance had file-based read access, for the login I used, so that it could import the CSV data in bulk.
I needed to make various changes to the instawdb.sql build script to get it to work. I advise a cup of coffee before attempting it yourself, and I’ve provided a modified version of the Adventureworks build script I’ll use for this demo.
First, I opened the instawdb.sql in SSMS and commented out all the database creation code at the beginning, as well as all schema creation. Flyway does all the schema-creation for us, but we must prepare the empty database. These were the only changes I had to make, for Flyway.
I then replaced the $(SqlSamplesSourceDataPath)
placeholder with the actual path on the server where the data was held. In my case, this was D:\Database\AdventureWorksData but of course you’ll need to change this, as required (it would be useful, here, to provide a Flyway placeholder in the config file because it is likely that the location will be different on different servers!). Likewise, I set the $(DatabaseName)
to the name of the database (AdventureWorks).
The final changes were necessary to get the script to work in SSMS with SQL Server 2016 onwards, without needing to install Full Text. Unless you have it installed on your server, you’ll need to do likewise and remove all references to full text search, unless from the script.
This means deleting [dbo]
.[uspSearchCandidateResumes]
procedure, the fulltext catalog and any fulltext indexes, such the following one on the Document
table:
1 |
CREATE FULLTEXT INDEX ON Production.Document(Document TYPE COLUMN FileExtension, DocumentSummary) KEY INDEX PK_Document_DocumentNode; |
Now, remove the following three lines from the BULK
INSERT
statement for every table:
1 2 3 |
CHECK_CONSTRAINTS, CODEPAGE='ACP', DATAFILETYPE='char', |
This means that, for example, that the BULK
INSERT
for the [Production]
.[WorkOrderRouting]
table should look like this:
1 2 3 4 5 6 7 |
BULK INSERT [Production].[WorkOrderRouting] FROM 'D:\Database\AdventureWorksData\WorkOrderRouting.csv' WITH ( FIELDTERMINATOR='\t', ROWTERMINATOR = '0x0a', KEEPIDENTITY, TABLOCK ); |
Finally, add these lines at the end of the bulk loading section.
1 2 |
PRINT 'enable constraints' EXEC sp_msforeachtable "ALTER TABLE ? WITH CHECK CHECK CONSTRAINT all" |
That’s it; you can now run the script in SSMS and check that it works. I’ve found that if something works in SSMS on an empty database with the required schemas created, it will work with Flyway. I’ve saved a version of this file in GitHub, along with its’ UNDO twin.
Setting up the Flyway project
To get this running with Flyway Teams, I created a new AdventureWorks project within my Flyway Teamwork framework. You’ll need to add a flyway.conf file to the project folder with the following settings (recent updates of Flyway have sometimes required the 'encrypt=true;trustServerCertificate=true;'
tail to the connection string):
1 2 3 4 5 6 |
flyway.url=jdbc:sqlserver://<MyServer>;databaseName=AdventureWorks;encrypt=true;trustServerCertificate=true; flyway.schemas=dbo,HumanResources,Person,Production,Purchasing,Sales flyway.placeholders.projectName=BuildExample flyway.placeholders.projectDescription=A demonstration of doing builds in Flyway projects flyway.locations=filesystem:./migrations flyway.placeholders.Branch=Main |
The Migrations subdirectory contains three migration files that build the tables, add data and then add indexes and constraints, in a fairly conventional migration-based approach. I also added an afterMigrate
process that stamps the details of the version in the extended properties (comments) of the database:
I created each of the four ‘build’ versions as variants in the ‘Variant’ subdirectory.
Testing the single fat migration
We load a copy of the Adventureworks build script, modified as described earlier, into the Migrations subdirectory of the FatMigration variant. You need to rename it with a filename beginning with ‘V’ and followed by the version number.
We now create our empty target database on the server with its own unique name and add to the FatMigration directory the flyway.conf file with the configuration details (plus the other required files). If you use the same server, the connection string will just need to be amended with a different database name for the variant database, and with the line flyway.placeholders.Variant=fatMigration
We open a DOS prompt or PowerShell session, make the variant directory our current working directory…
1 2 3 4 5 |
cd <MyPathTo…>\AdventureWorks\Variants\FatMigration . '.\preliminary.ps1' Write-Output @" Processing the $($dbDetails.variant) variant of $($dbDetails.branch) branch of the $($dbDetails.project) project using $($dbDetails.database) database on $($dbDetails.server) server with user $($dbDetails.installedBy)" "@ |
…and type in…
1 |
Flyway migrate |
…and, as if by magic, the database will be built on the server you specify, and work happily. If we then get information about the single fat migration, we can see that all is well.
PS S:\work\Github\FlywayTeamwork\AdventureWorks> flyway info -outputType=json { "schemaVersion": "2017.1", "schemaName": "dbo, HumanResources, Person, Production, Purchasing, Sales", "migrations": [ { "category": "", "version": "", "description": "<< Flyway Schema Creation >>", "type": "SCHEMA", "installedOnUTC": "2022-06-08T09:37:04.847Z", "state": "Success", "undoable": "", "filepath": "", "installedBy": "Phil Factor", "executionTime": 0 }, { "category": "Versioned", "version": "2017.1", "description": "Adventureworks Build", "type": "SQL", "installedOnUTC": "2022-06-08T09:41:54.260Z", "state": "Success", "undoable": "No", "filepath": ".\\migrations\\V2017.1__Adventureworks_Build.sql", "installedBy": "Phil Factor", "executionTime": 288668 } ], "allSchemasEmpty": false, "flywayVersion": "8.3.0", "database": "AdventureWorks", "warnings": [], "operation": "info" }
As this is just a versioned migration, if we want to make changes to the database, we can add SQL patches (migration scripts) to make alterations.
Test out the broad baseline script
Let’s now try building the database using a Flyway baseline migration file. I’m using the same build scripts as before but just with the B prefix and a different version number (version 2017.10). Just to demonstrate that Flyway will ignore any versioned migrations with lower version numbers, I’ve also included the 01-03 migrations. These are on Github here.
Once more, we make the Baseline variant our current working directory, create a new copy of Adventureworks with a unique name on the server, and alter the flyway.conf accordingly. Then, we execute…
1 2 |
Flyway migrate flyway info -outputType=json |
The flyway info
command should give you something like this…
flyway info -outputType=json { "schemaVersion": "2017.10", "schemaName": "dbo, HumanResources, Person, Production, Purchasing, Sales", "migrations": [ { "category": "", "version": "", "description": "<< Flyway Schema Creation >>", "type": "SCHEMA", "installedOnUTC": "2022-06-08T14:11:25.113Z", "state": "Success", "undoable": "", "filepath": "", "installedBy": "Phil Factor", "executionTime": 0 }, { "category": "Versioned", "version": "2017.10", "description": "Adventureworks Build", "type": "SQL_BASELINE", "installedOnUTC": "2022-06-08T14:14:19.897Z", "state": "Baseline", "undoable": "No", "filepath": "", "installedBy": "Phil Factor", "executionTime": 173996 } ], "allSchemasEmpty": false, "flywayVersion": "8.3.0", "database": "AdventureWorksB", "warnings": [], "operation": "info" }
It performed the SQL Callback to insert a note of what happened as a comment for the database.
You’ll have noticed that I’ve edited the values of flyway.placeholders.projectName
and flyway.placeholders.projectDescription
in the flyway.conf file so that they show up in the version stamp.
The Flyway migration run left the migration files alone because the version numbers they have is lower than the baseline. Had they been higher, they’d have been executed after the baseline, producing a sea of red onscreen.
Test out the Insidious Repeatable
Here, we’ll need to add a cleanup routine to the Adventureworks build script, to remove everything that was there before. As we are building from scratch, we could easily put in a generic routine that does what the Flyway Clean command does, but we don’t want to do that because we want to preserve the history, and we would need a tear-down for every ‘repeatable’.
However, this means that the ‘repeatable’ build file is doomed to be the only file in the migration locations. Repeatables were intended for scripts that use the ALTER
command and would really need to be one per database object, since one would be unlikely to ALTER
a whole range of objects in one iteration.
So, rather than a generic ‘clean’ routine, we’ll just sigh and add a section in the build script to clean away everything that was there (it’s the same code that is in this undo file in the FatMigration project, but this time just added to the start of the build script).
We need to provide a version number to the final database so that will have to be a separate file, which we can’t change. We just have to add dummy files every time we want a new version of the database. Thank goodness AdventureWorks doesn’t change much! The AdventureWorks bike-sellers must have a perfect corporate business model.
When we run Flyway -migrate, we get this. We can see that the version is correctly entered.
+------------+---------+-------------------------------+------------------+----------+ | Category | Version | Description | Type | State | Undoable | +------------+---------+-------------------------------+------------------+----------+ | | | << Flyway Schema Creation >> | SCHEMA | Success | | | Versioned | 2017.1 | Adventureworks | SQL | Success | No | | Repeatable | | Adventureworks Kill and Build | SQL | Success | | +------------+---------+-------------------------------+------------------+----------+
If we want to add something, we’ll need to add a dummy file with a new version number, otherwise it will stay at the version that we’ve set it at with that dummy migration. Flyway won’t approve if we merely change the name of the file to a version number of 2017.2
Detected applied migration not resolved locally: 2017.1. If you removed this migration intentionally, run repair to mark the migration as deleted. Need more flexibility with validation rules? Learn more: https://rd.gt/3AbJUZE
If you add a blank version file every time you make a change to the repeatable file, it might provide a good way of transitioning from a build system to Flyway. After all, you can add patches and change scripts to these files and find that a versioned patch file represents an easy and visible way of making changes.
Test out the Java, Batch or PowerShell migration
Here, we’ll merely use a PowerShell script to execute a SQL build file in the directory. The SQL file will be ignored by Flyway. It is actually executed by SQLCMD via the script file. This PowerShell script, through the magic of the Teamwork framework, is pretty simple.
The first line loads the framework code afresh and gets hold of the current flyway settings so that it can access the database. The Process-FlywayTasks
cmdlet then executes a scriptblock that does all the work of executing the file.
1 2 3 4 5 6 |
. '.\preliminary.ps1' Process-FlywayTasks $dbDetails $GetdataFromSQLCMD @( "", #query ".\Migrations\Adventureworks_Build.sql",#file to execute $true #simple query ) |
When we run the Flyway migrate command, we see this….
Successfully validated 1 migration (execution time 00:00.102s) Creating schema [HumanResources] ... Creating schema [Person] ... Creating schema [Production] ... Creating schema [Purchasing] ... Creating schema [Sales] ... Creating Schema History table [AdventureworksS].[dbo].[flyway_schema_history] ... Current version of schema [dbo]: null Migrating schema [dbo] to version "2017.1 - BuildAdventureworks.ps1" Executing powershell \\Github\FlywayTeamwork\AdventureWorks\Variants\script-based\.\migrations\V2017.1__BuildAdventureworks.ps1 FlywayTeamwork framework loaded. V1.2.138 Executed GetdataFromSQLCMD Successfully applied 1 migration to schema [dbo], now at version v2017.1 (execution time 02:55.492s) Executing SQL callback: afterMigrate - Add Version EP WARNING: DB: Recording the database's version number - 2017.1 (SQL State: S0001 - Error Code: 0) WARNING: DB: CREATE_EXTENDED_PROPERTY - .AdventureworksS (SQL State: S0001 - Error Code: 0)
This way of doing the build is the arms-length approach. You can now make endless changes to the actual build script without Flyway even raising an eyebrow. It doesn’t check the checksum of the SQL Script file for changes because it is outside its’ remit. If you were to do this to a migration file that had been successfully applied, Flyway would have reported this and aborted the migration.
This way of running a build script, of course, makes a nonsense of the whole point of using Flyway, because Flyway cannot check whether you are slipping in changes and thereby ‘drifting’ from the version. However, I’ve seen requests from users who wish to execute a whole directory of SQL files within a single migration. This would work fine with this approach but if the system hits an error, SQLCMD is poor at error-reporting and you’ve a fine mess to sort out.
Conclusions
With what we have done, we now have five identical databases, four variants and the conventional migration. You’ll have probably come to the same conclusion as me: if you are used to doing a build, and you wish to convert to Flyway, then you can use the ‘Fat Migration’ method pretty safely. It will do the build as before and stamp the result with a version number. However, like all these other methods, you will only get the advantages of keeping all your copies of the database at the same level later, when you can provide a chain of migration files to each subsequent database version that you need to maintain.
Of the other methods, the Broad Baseline is more appropriate when you want to stop having to use a glut of ‘historic’ migration files that are to versions in which you’re no longer interested. The ‘Insidious repeatable’ is just too difficult to create and use responsibly for this purpose, because if a build becomes the start of a migration, it cannot be certain of the ‘state’ of the database before it is executed. Even a single file must be full of guard clauses. Even if used successfully, the versioning is compromised. The PowerShell migration has only a very specialised use, such as for executing several SQL files within a single migration, or for adding a dataset as a migration.
Tools in this post
Flyway Teams
Ideal for organizations looking to improve collaboration and fine tune their processes during development and the deployment of database changes.