Product articles
Flyway
Database Branching and Merging
Moving to Team-Based Database…

Moving to Team-Based Database Development with Flyway

Describing a route from a basic, 'managed' system of database development to use of branching and merging and CI, using Flyway. By taking these steps, you'll reduce development conflicts, lift testing restrictions, and the organization will have much more flexibility on the release of features and bugfixes.

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

The first step from ‘chaotic’ to ‘managed’ database development is all about making it easier and safer to release new database functionality into production. The aim at the next level, ‘Branch/Merge‘, is to improve development productivity and to allow larger teams to collaborate effectively on development. Development processes need to scale out to accommodate branch-based database development and to provide greater freedom for experimentation and testing. The merge process, where the branch is committed to the database, must incorporate code quality checks, integration tests and team reviews. To move to the ‘Continuous Integration‘ level as much as possible of the processes established previously must be automated, so that the release process remains safe and reliable, even at an increased cadence of delivery.

This can be summarized as follows:

Level Build tool Source Control Artefact Advantage Issues
Managed Flyway Community or SoC Object-level script directory (SoC) or migration scripts (FC) Can track changes, produce an audit trail, and reproduce any prior version Flyway Community is less versatile in the extent of scripting that is possible.

SoC provides clear feedback of changes.

Branch/Merge Flyway Teams Object-level script directory, migration scripts (SQL, Java, PowerShell Bash or DOS) Can delay or pull forward features. Supports variants. Reduces migration-file clutter. Encourages freer development work and experimentation Unless branch tasks are reasonably independent of each other, Merging becomes a manual process that requires skill and negotiation
Continuous integration Flyway Enterprise Object-level script directory, migration scripts Can release in small increments for easier rollback and for testing Requires automation scripting, scripted tests and good source-control tools

Flyway and team-based development

Flyway imposes a simple but effective system of database versioning, where a database, at any version, is produced by running a series of immutable migration files. If you have only one development database, if you do not allow more than one developer working concurrently on the database, and you preserve every migration, then it all works in an uncomplicated way. However, for larger databases, development isn’t like that. For a start, there will be not one but many developers. At some point, we must extend Flyway to be able to support a team of developers working concurrently on several schemas and databases, at different versions.

The Flyway Teams edition adds features to help with this, as well as to give developers more flexibility in the way they work. Ironically though, this extra flexibility comes at the cost of allowing developers to ‘change the past’, contravening Flyway’s law of migration file “immutability”.

For example, it allows developers to provide an undo script for every migration, so they can slide back down a version to correct a mistake to the latest migration. It also supports baseline migrations, so developer can build a particular version of the database with a single script, rather than run the whole series of migration files up to the required version. Standard features of Flyway such as a ‘Repair’ command or the ‘Clean’ command will also allow a developer to tinker with the existing migration chain for a database.

Consider the use of undo scripts. If you have a sequence of migrations v1-v2-v3-v4, a developer can “undo” to v1 (by running the u4, u3 and u2 scripts), correct the v2 migration and then rerun the v2 to v4 sequence. Once the migrations can be rerun from the point of alteration, this will validate it for the database being built. This is all easily manageable when there is only one copy of the database being developed. However, as the Dev team grows, several databases will be built from the same source. Now every developer using that source will need to be able to “undo” to v1 and rerun the sequence, or simply perform a ‘clean’ and rebuild their version from scratch.

There are other possible complications for team-based development too. For example, if there were existing baseline migration scripts for v3 or v4, then these would also need to be updated, otherwise new databases that are created won’t contain the v2 changes. They may also fail to execute subsequent migrations because their source is different so tables or views might have changed, and procedures may be different.

Why then, you might wonder, does Flyway Teams even allow you to change past migrations? The reason is that it is only in certain parts of the development cycle that immutability matters, and at other points in the process you really need to alter migrations easily. A small change within the release process can lead to catastrophe, so preserving the immutability of the migration files is important in the Production (or ‘main’) branch, as well as in Develop.

However, while developers are working in isolated branches, use of features such as undo scripts and baseline migrations gives developers the freedom to experiment, try out wild ideas, and alter ‘baked in’ database structures. Flyway accommodates both extremes.

Limitations of the shared development server

Even though a relational database is intrinsically a multi-user system, the practice of more than one database developer sharing a development database and building on it can shock a procedural coder. However, it is perfectly safe if the users are not working concurrently on the same objects. Any conflicts must be dealt with straight away, and each developer needs to know what the others are doing and what objects are likely to change. The system I suggested for ‘managed’ development in the previous article works perfectly well in these circumstances.

However, for a larger corporate database, with more developers, and stricter controls in place for auditing and coding quality, it starts to cause problems. For example, how can a team of developers work on a shared database in way that provides an auditable trail of who made what changes and when?

One option that avoids branching is that they are restricted to the ‘offline model’ of writing a Flyway migration script and running it, for every change. However, this is unlikely to be accepted by the team. The alternative is that developer makes changes using whichever tools they wish, and then use a workflow like the following:

  1. Each developer selects just the objects that they have altered, and their dependencies, and generates a migration script for the changes, using SQL Compare
  2. They add the migration script to the Flyway directory with a ‘semantic version‘ incremented at the lowest level (usually patch), leaving a gap for luck
  3. Flyway is invoked, only once all the all the work is added for the next version and tested.
    1. Flyway executes all the migrations together, in a batch, so that any conflicts, usually caused by changed dependencies, that produce an error will result in a rollback to the previous version (the production version)
    2. If an error occurs, the team find the problem, among all the migrations, fix it and try again.
  4. Once the migration run succeeds, the team can:
    1. Check the resulting new version against the shared server to verify that all objects have been included
    2. Commit the migration scripts to source control. The Flyway schema history table will attribute all migration scripts to the person who ran the batch. To preserve a ‘who did what’ audit trail, each script will need to be manually assigned to its originator, at the point of commit.
    3. Create or update the object-level scripts for source control

If the work to be allocated is separated out to minimize dependencies, then errors should be kept to a minimum. However, the main problem with the strategy I’ve described is that it is still impossible to plan the release of features and experimental components of the database. It also becomes more difficult to have variants of a release, such as for different versions of the RDBMS, or for regional or cultural variants of the application. Testing must be done after all migrations are applied, which can be more difficult.

This sounds complicated, but without sticking rigidly to this sort of exercise, it is very difficult to be certain who did what and when without using Extended Events or SQL Audit in SQL Server.

Moving to a Branch/Merge model

If each developer works on an independent copy of the database, the process is similar but becomes simpler because all changes in the database belong to that developer. The obvious advantage is that it allows the development effort to be compartmentalized into task-based branches, with some developers working on feature branches, and others doing bug-fixes or running tests. A downside of this way of working is that it takes longer for conflicts to become apparent, and developers can no longer easily detect changes made by others that could affect their area of work.

Flyway Branching  and Merging Workflow

My article, Flyway branching Walkthrough proposes a way to organize development work into task-based branches, each managed by Flyway, and demonstrates a how to implement this workflow using my FlywayTeamWork PowerShell framework

For the larger development it is better to think of all development work being within an isolated branch, initially from the ‘Develop’ branch which is ultimately a branch of the production database. The only visible migrations in ‘main’ (a.k.a. ‘master’ or ‘production’) will be those with the release versions and patch versions as their destination.

You create branches to isolate your code changes, which you test before merging to the parent branch. Normally a merge is related to a definable job of work, such as a business function, an application-interface, a reporting module: who knows, but it is reasonably isolated.

When using a Branch/Merge approach with Flyway, each branch must have its own database that represents the state of the branch. More unsettling for traditional Flyway users, each branch needs its own ‘foundation myth’ or chain of migration files. This means that, when a branch is created, the branch must start with either a baseline migration, or an initial migration that takes the branch’s database from zero to the exact version of the parent branch’s database at the point of branching. Once the branch is set up, the developers can work in a shared environment within that branch as they probably once did. If more than one developer is working on a branch and they wish to work in isolation, they can do so by having one branch per person, while remembering that they face another merge.

Branch-based development

In effect, each branch requires access to three copies of the database while branch development takes place:

  1. The branch database for development
  2. Either a live database or a model of the database as it existed when the branch was created
  3. The latest live database for of the parent branch (Develop, in this example).

Each developer uses the latter two only for reference and never writes to them. At the point of merge, they will also need a private build of the latest version of Develop for testing purposes.

branching and merging in database development

This allows a programmer to work in isolation but also be aware of changes committed to the parent branch. If, for example, you were working in a branch to create a set of functions to support the management of people (customers, employees, contacts and so on), then you’d keep a close eye on changes to the parent branch for the appearance of changes to the base tables.

Since every branch has its own database, with its history held within the database, this allows full use of ‘undo’ within the branch (see, for example, Automating Flyway Undos) and all the other Flyway Teams features that allow more efficient development but can potentially defeat strict versioning.

At the point of a merge back the parent branch (Develop), the deliverable is not the chain of migrations used to develop the feature in the branch, but instead a newly created migration script that takes the database built from the current state of the Develop branch, as represented by its chain of migration files, through to the merged version. The Develop branch should only have the history of development version, never the migrations of the feature branches. Similarly, the production branch will only have the history of the production version. This allows features to be added when it is appropriate to do so rather than when they are finished. It also greatly reduces migration-file-clutter.

The merge

When the work within a branch is complete the developer makes a pull request, and this will trigger a team review of the feature. To review the proposed changes, they will want to see a script that would take the database from what existed at the point of branching to the current state of the branch database. If the branch has adopted a strict migration development approach, a Flyway ‘dry run’ script would provide this, together with the comments. If, on the other hand, you’ve adopted a more eclectic approach to the branch development, you can use SQL Compare to produce a synchronization script that uses the model of the parent branch database as source and the current state of the branch as target.

Either type of script will show only the changes made by the branch, and the database objects affected. Before review, this script will need to be liberally documented because it will form the basis of the permanent migration script that represents the branch.

Once reviewed, the team will decide whether to merge the feature back to its parent branch, and if so when. The merge script will be a permanent migration in the parent branch. There are four issues that most frequently cause a merge operation to fail, all of which really boil down to ‘dependency problems’:

  1. You’ve altered an object without checking whether it had dependent objects that would be affected by your change.
  2. Changes made to the same objects, in the parent database, are incompatible with changes made in your branch.
  3. Changes have been made to an object in the parent database, on which a new object you’ve created in your branch depends.
  4. An object was created in the parent database with the same name as an object you’ve created in the branch.

There are, in fact, many other things that can ruin a merge, sometimes in subtle ways that are hard to debug. An eyeball check at the merge review will help greatly with this. For the ‘Fiddly Four’ reasons for a failed merge, several RDBMSs support ways of running checks on the parent database to make sure you can avoid them, such as running dependency checks.

To minimize the possibility of merge failures, you need to run some thorough “pre-merge tests”. You run the merge script a private copy of the current parent database (i.e., on the Develop branch database, as it currently exists). Once you get a successful migration you will then need to run sufficient integration tests on the altered database to be confident that the changes haven’t broken anything.

Once all this is done, the branch developer can use Flyway to migrate the Develop database to the new version and commit the changes to the Develop branch. As part of this process, they can generate other required build artifacts for the new version, such as object-level scripts. The child branch may be removed after the merge, or it may continue. The develop branch is permanent, as it the main (or production) branch.

The branch and merge may seem complicated, but the upside is that the exact same process is applied across the entire tree, including patches or releases into the production branch.

Moving to Continuous Integration (CI)

A CI approach becomes possible once a team have adopted the Branch/Merge level. To move towards a more rapid integration, the systems developed in the Branch/Merge level need to be automated and made more visible across the team. This requires such tools as issue management, effective messaging, and a system for publishing reports on progress.

The start of a deployment workflow is triggered after a successful merge from the develop branch into the production branch. A new release appears in the production branch. In a CI approach this automatically triggers a migration, running the migration script for the new release on the copy of the production database used by Flyway’s production branch, followed by automated testing.

The coverage of automated testing needs to be as comprehensive as possible because a production system can be brought to its knees by a performance problem, deadlock or resource leakage as well as a bug, or a mistake in a business process.

With rapid releasing, the ‘visibility’ of the process across the team needs to be sufficient to warn specialists of their involvement and the likely scale of their work required. If a release requires security clearance, for example, the experts will need to know in simple terms what has changed. Any acceptance process will require that business analysts are clear about the repercussions of a release.

At the CI level, any manual process raises a question mark. Can it be scripted? Flyway Enterprise (a.k.a. Redgate Deploy) comes into its own at this stage. It provides the tools to run many of the routine tasks that are required when delivering a new version, such as producing change reports, an audit log, running code quality checks, drift checks and so on. By using script callbacks, the team can automate all these tasks.

Flyway, as a CLI tool, will also work easily with build systems, workflow applications, issue management and a host of other tools that aid the checks, workflow and reporting that underly a DevOps CI development.

Summary

To support database Branch/Merge and CI, Flyway becomes increasingly a component in a development environment that integrates several supporting tools. Flyway can manage single-handedly a single-developer database that is embedded in a single application. However, an organisation-scale development requires an approach that integrates the work of several specialist tools and encourages automation. Flyway, a cross-RDBMS CLI tool with its speciality of managing the development of database migrations, fits in well.

 

Tools in this post

Flyway

DevOps for the Database

Find out more

Flyway Enterprise

Standardize Database DevOps across the organization

Find out more

Flyway Teams

Ideal for organizations looking to improve collaboration and fine tune their processes during development and the deployment of database changes.

Find out more