Dry Runs for Database Migrations using Flyway
How Dry Run scripts work, and how they can be used to deliver a single-script release artifact to Staging, verify placeholder substitutions in SQL migration scripts, and simplify team code reviews.
Instead of executing the SQL against your specified database during a migration run, Flyway can, in the Teams or Enterprise editions, deliver it to you unexecuted, as a “dry run” script file.
Why, you might be wondering? There are several unrelated, good reasons for this.
- Firstly, you can use the dry run script as the deployment artefact. If your release process frowns upon use of Flyway to do the deployment to Staging, you can simply execute the dry run script and the Flyway schema history table will be updated as normal.
- Secondly, you can use it to check that your Flyway placeholders have done what you expected to your SQL Scripts
- Thirdly, you can use it for team-based reviews before merges, or before code is committed to the production branch. By doing spot checks on the SQL that is going to be migrated, potential problems can be avoided.
Delivering the release artefact
A deployment or release process can take many different forms but once a team has found a process that works, and has the support of all participants, any change in practice will be greeted with caution. In many cases, the ‘release artefact’ that is inspected, used for testing, and for deployment to Staging is just a migration script.
Few publishers of database-driven applications will be able to insist that their client’s IT department change their approach to using Flyway for upgrading the production databases, or to allow access to that database. In this case, a database upgrade must be done by sending them a SQL file to apply.
The advantage of using a dry run script, in such cases, is that it updates the Flyway schema history table of the target database to reflect the new version. This means that you can execute these scripts safely, outside Flyway and so, during a deployment, Staging wouldn’t need Flyway, only the upgrade code. Before executing the dry run script, you will need to check manually to ensure that the target database is at the correct starting version.
This reduces the chances of anything going wrong and allows a database release to have a single upgrade script that can be archived separately. It is possible to use a dry run script do an entire run and thereby create a database from scratch for provisioning purposes, but the script doesn’t create the schemas for you.
Checking Flyway placeholder substitution
I find dry run scripts most useful when checking the results of placeholder substitutions in SQL code. I’ve prepared a rather extreme example of this, which I’ll demo shortly, with a project that will compile the same code for four very different RDBMSs, abstracting the differences by having different values for parts of the code that exhibit differences from SQL-92.
However, there is a broader use of placeholders in SQL code, such as for locating the Flyway schema history table, or version-stamping the database. The problem is that, without a dry run, it is impossible to see how the substitutions were made. Usually, the error you get if things go wrong will tell you the line, but it won’t tell you the whole context.
Team-based code reviews
Dry runs are also a useful way of reviewing code when there are any doubts at all about what SQL code is going to be executed. Where there is a team-based review process, it helps to be able to inspect the code of the whole migration run before it is applied but after placeholder substitutions have been made. DBAs, security experts or business analysts all might want a chance to inspect code before the die is cast and the code becomes part of the ‘canonical source’.
This review process could range from something as simple as running some static code analysis, to running more extensive code quality checks, to a full-blown peer-review process. Any team-based review process is most likely to be required before a merge operation. Whether it is feature branches being merged into the develop branch or the develop branch being merged into main or production, there will be a search for clashes and up-front tests to make sure that nothing has or will be been broken. With dry runs, you prevent further migrations having to be added to correct faulty migrations. They are much more suited to this task than the undo scripts, use of which is best-suited to a branch development.
As with many Flyway Teams features, dry run scripts are there if you need them as part of your development processes, but you’re not compelled to change the way that you work. They are almost all there to support team processes and I’ve never come across two database development teams who work exactly alike.
How Dry Run scripts work
When Flyway does a ‘Dry Run’, it sets up a read-only connection to the database. It reads in all the migration files that are needed for the run and uses them to generate a single SQL file containing all statements it would normally have executed. This SQL file also contains the necessary statements to create and update Flyway’s schema history table so that all schema changes are tracked in the usual way.
To ensure that the changes executed from a dry run script match what’s in the migration files, it is best to keep the migration files as the primary source. This means that no changes should be made directly to a dry run file. Instead, you should make any required changes by adding new migration files, and then generate a replacement dry run script. This prevents alterations getting lost.
Limitations of dry run scripts
Flyway has many features beyond the simple migration, and some changes cannot be intercepted and will instead be executed as normal. Flyway can deal happily with SQL versioned migrations, SQL repeatable migrations and SQL callbacks.
However, Flyway cannot determine what SQL is executed against a database from a DOS, Java or PowerShell script. This is because they do not share the same connection to the database. This means that any database changes made by arbitrary script migrations or script callbacks will still go ahead during a dry run and won’t be captured in the dry run script. The best advice I can give is that you can’t currently use dry runs and, at the same time use script migrations or script callbacks that make any database changes.
Trying out a dry run
Probably the easiest way to demonstrate dry runs is to try one out. Here is a section of the SQL92 build script (V1.1.1__Initial_Build.sql) for the PubsAgnostic project. By using placeholder substitution, we can use a single build script to recreate the original Pubs database on four different relational database systems.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 |
/*TitleView View */ ${viewStart} CREATE VIEW ${schemaPrefix}titleview AS select title, au_ord, au_lname, price, ytd_sales, pub_id from authors, titles, titleauthor where authors.au_id = titleauthor.au_id AND titles.title_id = titleauthor.title_id; ${viewFinish} /*ByRoyalty procedure*/ ${procStart} CREATE PROCEDURE ${schemaPrefix}byroyalty (${arg}percentage int) ${procBegin} select au_id from titleauthor where titleauthor.royaltyper = ${arg}percentage; ${procEnd} ${procFinish} /*reptq1 procedure*/ ${procStart} CREATE PROCEDURE ${schemaPrefix}reptq1 ${emptyProcArgs} ${procBegin} (select pub_id, avg(price) as avg_price from titles where price is NOT NULL group by pub_id) union all (select 'ALL' as pub_id, avg(price) avg_price from titles where price is NOT NULL); ${procEnd} ${procFinish} /*BReptq2 procedure*/ ${procStart} CREATE PROCEDURE ${schemaPrefix}reptq2 ${emptyProcArgs} ${procBegin} (select type, pub_id, avg(ytd_sales) as avg_ytd_sales from titles where pub_id is NOT NULL group by pub_id, TYPE) union all (select type, 'ALL' as pub_id, avg(price) avg_price from titles where price is NOT NULL GROUP BY type) union all (select 'ALL' as TYPE , pub_id, avg(price) avg_price from titles where price is NOT NULL GROUP BY pub_id); ${procEnd} ${procFinish} |
You’ll notice the placeholders enclosed in ${}
. Flyway will substitute the correct additions for the dialect of SQL being used.
To run the build, I temporarily amended the UNDOBuild.ps1 script. All I did as to add the Flyway argument, -dryRunOutput
, to provide the filename and path to an output file for the script, as follows:
1 2 3 4 5 6 7 |
$FlywayUndoArgs += <# the project variables that we reference with placeholders #> @( "-schemas=$($Details.schemas)", "-placeholders.schemas=$($Details.schemas)", #This isn't passed to callbacks otherwise "-placeholders.projectDescription=$($Details.ProjectDescription)", "-placeholders.projectName=$($Details.Project)" "-dryRunOutput=`"<pathtoOutputFile>`"") |
Where <pathToOutputFile>
is the output file for the SQL, including the path.
Execute the script and then you can execute a Flyway migrate
run. e.g.:
1 |
Flyway @FlywayUndoArgs migrate '-target=1.1.1' |
Note that although this -dryRunOutput
parameter should only work for migrate
or undo
commands, it also currently works for info
commands, which provides results that aren’t useful. Here is a section of the resulting dry run script from the migration, with the SQL as it would be executed.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 |
/*TitleView View */ CREATE VIEW titleview AS select title, au_ord, au_lname, price, ytd_sales, pub_id from authors, titles, titleauthor where authors.au_id = titleauthor.au_id AND titles.title_id = titleauthor.title_id; CREATE PROCEDURE byroyalty (percentage int) Begin select au_id from titleauthor where titleauthor.royaltyper = percentage; end;// CREATE PROCEDURE reptq1 () Begin (select pub_id, avg(price) as avg_price from titles where price is NOT NULL group by pub_id) union all (select 'ALL' as pub_id, avg(price) avg_price from titles where price is NOT NULL); end;// CREATE PROCEDURE reptq2 () Begin (select type, pub_id, avg(ytd_sales) as avg_ytd_sales from titles where pub_id is NOT NULL group by pub_id, TYPE) union all (select type, 'ALL' as pub_id, avg(price) avg_price from titles where price is NOT NULL GROUP BY type) union all (select 'ALL' as TYPE , pub_id, avg(price) avg_price from titles where price is NOT NULL GROUP BY pub_id); end;// CREATE PROCEDURE reptq3 (lolimit decimal(19,4), hilimit decimal(19,4), type char(12)) Begin (select pub_id, avg(price) as avg_price from titles where price > lolimit AND price < hilimit AND type = type OR type LIKE '%cook%' group by pub_id) union all (select 'ALL' as pub_id, avg(price) avg_price from titles where price >lolimit AND price <hilimit AND type = type OR type LIKE '%cook%' group by pub_id, type); end;// |
There is no attempt to wrap each separate migration into a transaction, so if you wished to use the dry run script as a release artefact, you’d need to manage that yourself. Also, there is no check to make sure that the target database you execute against is at the correct version.
The output from a dry run is just the SQL that would have been executed to perform the migration. If you execute the resulting dry run script in SSMS, you’ll see that the correct entries are made in the Flyway schema history table.
Conclusions
The concept of a dry run is simple, and best kept so. It just represents the SQL that would have been executed, if you had gone ahead with the Flyway migration run. If you intend to use a dry run script to upgrade an existing production database on a remote site, you’ll need to wrap the dry run script in a script that first ensures that the migration is for the correct version of the database, and if so can, where possible, ensure that the script is executed in a transaction.
Tools in this post
Flyway Teams
Ideal for organizations looking to improve collaboration and fine tune their processes during development and the deployment of database changes.