Product articles
Flyway
Applying hotfixes
Cherry-picking Migrations using…

Cherry-picking Migrations using Flyway

Describing the two main use cases for occasional "cherry picking" of Flyway database migration scripts: back-filling emergency production hotfixes to version control and managing parallel development streams.

In an ideal world, application development follows a predictable path. You have a series of features that you want to build. As business and application requirements evolve, you will inevitably need to plan in database changes to support new and enhanced features. Over time, you build up a sequence of changes which then get pushed into test, staging and production environments in the same order, so that all your environments are faithful reflections of each other, and everything works the same wherever you try it.

Flyway naturally supports this happy path; each change to the database structure is represented as a migration script, and the file naming conventions make sure that Flyway always deploys changes in the correct sequence. Each new or altered database object that gets released from the development environment is reliably rolled out to every other one. What’s more, the migration scripts themselves act as a clear trail of what was done (who ran the migration and when) and can be replayed after the event for debugging purposes.

Of course, we don’t live in an ideal world, and sometimes we need to stray off the ‘predictable path’. Cherry-picking migrations in Flyway Teams allows us to navigate our way around these situations in a consistent, database-agnostic way.

The ad-hoc hotfix

Something has gone wrong and our application has started throwing errors. It could be that we’ve released a change that seemed fine in a relatively quiet test environment but isn’t coping in a production environment under heavy load. Alternatively, we may have overlooked something in our production environment that isn’t reflected elsewhere, such as a more complicated user permission setup. Whatever the cause, it needs to be fixed as soon as possible to restore service.

The first action we might consider would be flyway undo. This rolls the database schema back to the state it was in prior to the most recent migration. However, this is liable to be a destructive operation, deleting valuable live data, which you will need to find a way to preserve, and requiring a rollback of all applications that use the database. It would certainly be a risky thing to do in a high-pressure situation.

More likely, though, the DBA who arrives on the scene may see a simple fix. It might be a missing index causing queries to time out, or an account with inappropriate permissions. Often, a line or two of SQL is enough to put the fire out. Strictly going “by the textbook”, they would write a migration with this SQL snippet, run it through all the lower environments, and eventually promote it into production. But that will take time, a luxury that is not always available. The DBA may also be unaware of work is in flight in development, not yet ready to promote. He knows the change has an acceptably small risk of harmful side effects and confidently executes his SQL on production. The errors stop coming through and peace is restored.

Emergency database hotfix

However, the developers’ migration scripts are no longer a faithful representation of how the production database came to its current state. So, our DBA saves the hotfix statements as a new migration, numbered as the next one in sequence appropriate for the production database. In the example above, with the database schema at V4, use of V4.0.1 both keeps the sequence intact and conveys the fact that it’s a hotfix. It can then be added to the overall collection. Ideally this script would be idempotent, meaning that the code itself will check whether the changes are in place and will only make the changes where necessary. However, this is not always possible.

using Flyway cherryPick

In our development environment, we can simply use flyway migrate and the hotfix will be included in the correct sequence. However, we don’t want to re-apply the hotfix to production, and so we use cherry-picking to apply just the next application update to the production database. Our hotfix migration remains as a record of what was done. This gives us a standard approach to approach to dealing with occasional ‘out-of-process’ hotfixes, across all database systems.

Parallel changes

Flyway is designed primarily to support simple, trunk-based development, with minimal use of branching. However, with a large application, it’s quite common to have multiple development teams working on different features in parallel. If they are working on the same database objects, with a single Flyway history table keeping track of the changes, they will need to coordinate with each other over the expected ordering of migration scripts in any deployment to the production environment, so that when all their changes are integrated, the database is in the intended state.

Managing parallel development streams

However, once again, the unpredictable real world intervenes and the earlier feature (V5) needs to be held back, because technical problems have emerged, or company priorities have changed. If the team are using long-lived feature branches, they can reconcile the sequence of migrations as and when branches are merged. In trunk-based development, though, migration scripts are merged up to the trunk as soon as they are tested ready for shipping. While the application change may be hidden with a feature flag, the database schema change may be harder to deal with. The developers have two options to keep everything on track. First, they can renumber their scripts to reflect the new order. If the changes have progressed no further than their own machines, and they can tear down their local databases and start again painlessly, that may be a good option. If the migrations are already in source control, and test and staging environments have already been updated, then it will require a lot of unpicking, by hand, to realign the code and the database’s schema history table.

The other option is to apply changes to production in the order in which features are going to be released. To do this, we first cherryPick the higher-numbered changes. Flyway understands that it is skipping over the older changes and applies just the migrations that we want for now.

using Flyway CherryPick and outOfOrder migrations

In the example above, the first cherry-pick results in migrating a database at version 4 so that it contains changes ready for version 6, but not version 5. Further ordinary migrations will apply later numbered migrations, but any script files from version 5 will be marked as not to be applied for now; flyway info will reveal that they are in a special “skipped” state. At some future point in time, when our delayed feature is ready, we can arrange for Flyway to execute the corresponding migration with another cherry-pick, this time including the outOfOrder option to indicate that we are aware that our migration has a lower number than the last one to be executed so far. The gap in our sequence of migrations is then fixed and we can revert to performing normal migrations.

This is a good example how Flyway can be used flexibly to support almost any type of team working and provides a database-independent, and methodology-independent deployment process

However, while cherry-picking migrations like this is a powerful tool, it needs to be used carefully. Used injudiciously, it will complicate your schema history and make it harder to keep your database schemas consistent across environments. We hope you rarely, if ever, find yourself in a situation where you need it, but if you do then it should help get you back to your normal process smoothly and safely.

Summary

Flyway supports automated databases deployments, whatever the type of database, or the team’s preferred method of developing databases, or the type of application. This has required it to be highly adaptable, adding features to help the team deal with tricky development problems, and to allow for events and circumstances that require an urgent response, and actions that are not “in the textbook”.

Cherry-picking is a good example of this. However, it is sometimes tricky to reconcile use of this sort of feature with the disciplines that allow for error-free deployments. Cherry-picking migrations will solve some seemingly intractable problems but use it without thought or planning and there is a risk that it will defeat the inherent safety of the deployment procedure.

Tools in this post

Flyway

DevOps for the Database

Find out more

Flyway Teams

Ideal for organizations looking to improve collaboration and fine tune their processes during development and the deployment of database changes.

Find out more