3 December 2021

A Database DevOps Workflow Using Flyway Enterprise and Redgate Test Data Manager

Tony Davis describes a typical database development cycle and deployment pipeline supported by Flyway Enterprise and Redgate Test Data Manager. It allows branch-based database development, using disposable databases (clones) and version control tools, promotes continuous integration and testing of changes and automates the build and deployment processes so that they are repeatable, fast and reliable.

Flyway Enterprise and Redgate Test Data Manager are a set of tools and services that will help you implement a controlled and automated development and deployment system for database applications. This article describes one example of a typical DevOps workflow that this supports, suitable for managing the database development cycle and deployment pipeline of any database system. It explains where the various tools slot into it and how they help simplify, optimize and automate many of the required tasks.

The tools and services in Flyway Enterprise and Redgate Test Data Manager are designed to be used and combined in several ways, depending on the team’s chosen development strategy, and we encourage you to adapt this workflow, as required. Please read my previous article, What is Flyway Enterprise?, first, if you’re unfamiliar with it.

An overview of the workflow

The proposed workflow splits development work into task-based GitHub branches and maintains strict control of database versioning. It promotes merging and testing of database changes, with automated build and subsequent tests to prove that the database fulfills all requirements.

It allows controlled team-based development, where there is never any doubt as to the exact version of any copy of the database, and where all changes are verified, tested and inspected frequently. The result is reliable deployments, where the team can then be confident that what is delivered is only what has been tested, inspected, and signed off for release.

The following figure shows the high-level view of the proposed workflow. It allows for the continuous delivery model, where every new database version is available for release but also recognizes that, for many teams, development and test is a cycle, where they select a version to be a release candidate at the right time.

Diagram Description automatically generated

Flyway Enterprise and Redgate Test Data Manager offer full support for this workflow, with tools and services to control and automate every stage in the development and delivery pipeline:

Self-service disposable databases – the team provision disposable copies of the latest database version for each branch of development, as required to fulfill the features requests and bug fixes that define the current sprint. Data virtualization allows fast delivery of these databases with minimal local storage overhead.
Version controlled team development – the team use change control tools, in tandem with the version control system, to maintain strict control of database versions, and to autogenerate migration scripts for deploying changes, and object-level state scripts for tracking what objects were changed between versions.
Continuous merging and testing of changes – change control, schema comparison and data virtualization techniques simplify the generation and testing of merge scripts, promote team-based code reviews and so help minimize ‘breaking changes’ during merges. As each branch is merged, tested, and committed to source control, this will trigger the build of the next version
Automated CI builds and migrations – change automation tools will automatically build each new version, incorporating DevOps best practices such as build verification, code analysis, automated testing and drift detection, to check for any problems
Controlled deployments – Deployments during the release process are automated in the same way as during development, the idea being that you are practicing and refining real deployments from as early in the development cycle as possible. Flyway Enterprise provides simple reports of what has changed, plus the generated scripts, warnings, and test results for detailed inspection by others.

By adopting a disciplined, and where possible automated, approach to these tasks, you will deliver more reliable databases, faster. Flyway Enterprise simplifies many of the scripting tasks involved in development and deployment of a database, to make them repeatable and reliable. Once each step in a process in understood and controlled, it allows the team to introduce appropriate automation and testing, to make the process easier, faster and less error prone.

The build and deployment tools in Flyway Enterprise can be used with CLI scripting, but also provide built-in modules to automate the way you build, test, and deploy databases. The tools are designed to fit, and co-exist with, the application’s existing development toolchain, source control, build, integration, tracking, and reporting systems. The build and deployment components come integrate directly with any major build or continuous integration system, including Jenkins, Azure DevOps pipelines and many others.

Stage 1: Database Provisioning

A sprint starts from a tested build of the latest version of the database. In a continuous release pipeline this will always be the current production release. In other cases, it might be the latest development version, in the “Develop” branch in source control. Teams will also occasionally need to provision copies of older supported versions, such as to investigate a reported bug.

Requirements

The initial database provisioning process generally entails creating a ‘reference’ version of the current production release, for use in development, and then distributing copies of it to the developers and testers.

The schema of the reference copy should be identical to production, but there might be production-only features such as replication or encryption, as well as users and access control settings, that need to be omitted. The reference version might contain a full copy of the production data, if it’s permissible, but otherwise, any personal data will need to be anonymized and other sensitive data masked.

Once the reference copy is created, you can, depending on how you do the build process, export the reference data files, ready for rapid importing during the automated build (stage 4, described later). Alternatively, the team can generate and store their own test data sets, and bulk import the data during the build process, to recreate the latest production, or development, version.

Individual copies of the latest version will then need to be rolled out to create or update as many databases as are required for development, or for various types of testing. The reference copy of the latest production release serves as the ‘baseline’ for each of these development copies, for generating migration scripts that wall ultimately deploy the next version.

Common provisioning problems

With a large application, it’s quite common to have multiple development teams working on different features in parallel. Teams often exploit the branching and merging capabilities of Git for scaling up team-based development in this fashion. However, it’s also common for the same teams to be restricted to using a single, shared database environment, for all development, which makes branch-based development at the very difficult.

The choice of a single shared instance is usually a logistical or security decision. For example, if you have a very large database, maintaining any more than one or two development copies takes too much time and management and requires too much disk storage space. Similarly for complex Enterprise applications supported by many interdependent databases; supplying individual, dedicated copies of this system is impractical.

Database provisioning with Redgate Test Data Manager uses data virtualization, or cloning, technology to remove these problems. The previous workflow diagram shows provisioning a dedicated database for each branch of development, as required to fulfill the features requests and bug fixes that define the current sprint. The database becomes a ‘disposable asset’ that can be created, tested and reset, repeatedly, as part of a Continuous Integration pipeline.

Regardless of how many copies are required, the team need to create only one full copy of the database (the data image). Developers can then create, refresh or reset copies, or ‘clones’, of this image, on each development and test server, in seconds. Each clone shares access to the same data, in the shared image, and stores locally only any subsequent changes made to the clone. This means that each clone requires minimal local storage space on the development or test server. For a SQL Server example of how this can be set up, see Provisioning development with the latest build using clones.

You can also create data images based on multiple databases (from a list of backups in a YAML file) making it possible to create dedicated development environments when dealing developing applications supported by multiple interdependent databases.

Stage 2: Version-controlled team development

Once the database provisioning process is complete, development work can begin. As discussed in Stage 1, Redgate Test Data Manager enables dedicated development databases, per branch, rather than have all developers access a single shared database. This has many advantages for team-based development.

As each developer makes, tests and commits changes, on a branch, Flyway Enterprise will auto-generate in source control all the migration scripts required to upgrade existing databases to the new version, as well as maintain the latest schema model (object level scripts).

Task-based branches

Through its support for data virtualization (see Stage 1) using Redgate Test Data Manager, Flyway Enterprise enables a DevOps development process that splits the work of a sprint into task-based branches.

Each branch will have its own copy (clone) of the latest version of the database, and each developer will make and test changes to this copy and commit all work to the branch. Each branch will also have access to a database representing the current version of the parent branch (e.g., Develop or main), as it existed the point the branch was created. Typically, there would be one developer working on each branch, but Flyway Enterprise also supports automatic branch switching, to allow collaboration.

Task-based branches prevent the common problems of one developer’s database changes interfering with another developer’s work, or of one developer accidentally reverting changes made by another. It also gives each developer a much greater degree of freedom. It makes it easier for a developer to perform development U-turns and encourages experiment and a more creative way of solving problems. Test harnesses are easier to create and branch development gives each team member much more freedom in running potentially disruptive tests. Finally, makes it easier to generate a script of each individual developer’s changes.

Continuous testing

Flyway Enterprise and Redgate Test Data Manager support continuous testing, throughout the development cycle. During branch-based development, the team can build unit tests to verify individual functions and procedures using built-in SQL-based unit testing frameworks. These tests are added to suite of regression tests that run automatically to test every new build and ensure that the whole system behaves as it should.

Often, tests won’t pass first time. There are usually a series of failures, corrections, reruns, until the code passes. This can mean creating and resetting a database many times, a situation for which use of clones is ideally suited. Each test run requires a set up that creates a copy of the database at the right version, including the data, then runs the test, assesses the result, and finally tears down the test set up, resetting the database to its original state, ready for the next test. See Unit Testing and Integration Testing with Clones for an example of how this can work, when using Redgate Test Data Manager.

Committing versioned changes to source control

Each developer commits tested changes to the branch, in their local source control repository. There are several ways of saving these development changes ways, and generally of managing the source code archive of a database:

Commit migration scripts (migration-based change control) – these scripts describe the progressive addition and alteration of database objects and data, between versions, and are applied by executing them in order.
Commit the ‘static’ object creation scripts (state-based change control) – maintain the latest CREATE script for each database object, and periodically generate build and migrations scripts from the individual object files using schema comparison
Save the build script for each version – which will create all database objects in the order of their dependency. This will include saving the data sets for each version so that a clean build can be made that includes the data.

Flyway Enterprise is optimized for migration-based change control, but fully supports the other approaches too. When working on their development databases, the team can use Flyway (command line or desktop) automatically generate versioned migration scripts describing each committed change, which can be committed directly to the source control system. The following simple example shows how development of a new “tagging” feature has resulted in three new versioned migration files in the branch folder of source control. These changes will also be reflected in updates to the schema model (object level scripts):

version-controlled database development

These versioned migration scripts will be the source for subsequent deployments and is the recommended method to move a release candidate through the deployment pipeline, when upgrading of a production relational database of any size. It allows team to update the live system, without downtime.

Stage 3: Merging and Testing of changes

As work on each branch of development completes it must be merged, in turn, into the parent branch, tested then committed, which will trigger the build of the new version (Stage 4).

Flyway Enterprise will help simplify this merge process and allows the team to perform a pre-merge build and test, to avoid the “breaking the build” when the changes are committed.

Simplifying merges

When a developer completes work on a feature branch, he or she issues a pull request to merge the all the committed changes to the parent branch. At this point, there ought to be a team-based review process, to inspect the proposed changes and look out for any potential conflicts.

When using a migration script approach to track development changes in each branch, successively merging multiple separate ‘strands’ of versioned migration scripts can be a difficult task. See Branching and Merging in Database Development using Flyway for a description of some of the challenges.

To simplify this operation, the team might consider condensing the chain of migrations that define the branch changes into a single migration file. In other words, the versioned migration chain for each branch is considered ‘ephemeral’, like the branch itself, and what’s committed to the parent branch (the Develop branch, say) is a single versioned migration file, tested to ensure that it will safely merge in the feature.

Flyway Enterprise offers several features to help with this. For example, a Flyway “dry run migration“, applying the branch changes to a clone of the parent branch database, as it existed when the feature branch was created, will produce a single script of all the changes for the team to inspect.

If the parent branch is still in the same state as it was when the feature branch was created, then the merge should be relatively simple. More often it will have moved on, as other work has been committed into it, in the meantime. To produce a first-cut migration script for a merge, in such cases, the team could exploit Redgate Test Data Manager’s data virtualization (clone) capabilities and Flyway Enterprise’s schema comparison features, in tandem.

For example, the team can spin up two clones of the current parent branch, apply to one of them all migration files added to the feature branch, and then use schema comparison to compare the original and migrated clones. This will provide a summary list of all the changes, helping to identify differences that are likely to cause conflict, and generate a first-cut migration script to merge all branch changes back into trunk, safely. It will generally preserve the data held within the target database or will issue warnings when it needs to be done manually, in which case you’ll need to provide the additional change scripts required to preserve existing data.

database branching and merging

Pre-merge build and test

The team need to test the merge works as expected, making all the required schema changes correctly, while preserving existing data. With Redgate Test Data Manager, the team can deploy a clone of the current parent branch and then use Flyway to apply the migration script to build the new version, including the new feature or bugfix, and run some tests. The team can also do some code quality reviews at this stage, using auto-generated static code analysis and detailed change reports, produced by Flyway Enterprise.

Some basic integration tests should be devised and run after merging changes to minimize the risk of “breaking the build”. Again, all tests are added to suite of regression tests that run automatically to test every new build and ensure that the system behaves correctly (stage 4). If all the tests pass, the ‘merge’ migration script can be committed to the parent branch, in the shared source control repository and this will trigger the build of the next version.

Stage 4: Continuous Build/Integration

In the example, workflow, each commit to the parent branch triggers an automated database build process to prove that the ‘raw source’ for any version in source control can be built without errors to produce a working database. Subsequent automated regression and integration tests will prove that the database satisfies all functional requirements. This stage of the process should be fully automated, with no human intervention required.

If the changes pass all tests, they can be delivered automatically to the release branch in source control, to be deployed to production environment by the operations team (continuous delivery) or even automatically deployed to production to be immediately usable by customers (continuous deployment). Alternatively, if there is more work to do before a release, the team can create a new data image from the successfully build and distribute disposable copies of the new version fort the next sprint.

Automated build pipelines

Flyway Enterprise provides a consistent and automated way to build any version of a database in source control, regardless of whether it’s required for some ad-hoc development work, or to prepare for a release. The build process can either build the database from scratch (empty) to the required version or migrate an existing database from an earlier version to the new one. In either case it will ensure that the metadata and static data in the database is at the right version.

Before distributing copies of the new version, you’ll also need to supply data, either generated or anonymized, that conforms closely to the real data, in style, distribution and characteristics. Redgate Test Data Manager can assist with this and includes data subsetting, masking and generation among its capabilities. It pays to start with a common version of the data, as well as the metadata. However, this data takes time to develop and prepare, and it is likely to change as the database application progresses. It makes sense to do this once, after each build or migration that affects the data, and before provision to the team.

When migrating an existing version, Flyway Enterprise will first perform a database drift check, comparing the schema of the database as it is now, to a ‘snapshot’ of how it looked when the migration script was created, or a reference database at that version. Uncontrolled changes can cause security problems and deployment failures, if undetected.

It will then automatically run migration scripts in the correct order and track which scripts have already been executed one the target previously. When Flyway migrates an existing database (at V1.1.0, say) to the new version, it ‘version stamps’ the migrated database and records what files were used to build the version, who created which version, and when. This strict model of database versioning means that developers are always clear and confident of the version of a database. This also makes it easier to coordinate database upgrades with changes to each associated application.

The build process incorporates DevOps practices such as build verification and static code analysis and produces detailed change reports, for inspection by the team. This will help improve the quality of the changes delivered and reduce the likelihood of a failed build.

As part of this build process, Flyway Enterprise can also generate the other build artifacts quickly and easily, to support a variety of development and test chores. For example, it can produce the object-level source for each new version, making it much easier to investigate the history of changes to a table or code module, between versions. It can also produce build scripts, so that the team have a fast and simple way to support ad-hoc development builds, as well as change reports and code analysis reports.

Flyway Enterprise builds are automated using command line tools and scripting, and many teams incorporate this build functionality directly into CI pipelines, using their preferred build/release tools, such as Azure DevOps or Jenkins. See Example CI/CD Pipelines for details.

Integration testing

If the build completes successfully, and passes automated regression testing, it can be deployed for integration testing, which is an automated process that ensures that the database processes, end to end performs as expected. Flyway Enterprise can, for example, automatically deploy the newly built database version to a cloud-hosted container, meaning that that version under test can be quickly created and reset to its original state for each test.

In an integration test, an entire process is run from a known data set, and the results verified against a confirmed and agreed version. The test data should be realistic businesses data that conforms to experience but changes as little as possible because it must be cross-checked by the business to make sure of the validity of the result. There will be a range of other testing, depending largely on context, to check for such things as scalability, resilience and performance. See Unit Testing and Integration Testing with Clones for more details.

It is at this stage that exploratory tests are done to increase test coverage, and additional integration tests are created to check new functionality. Every time you hit a deployment problem you add a test to ensure it never happens again.

Stage 5: Deployment Pipeline

When all sprint tasks are complete and the team have tested and verified all functionality, and all required deployment scripts, then the current version can become a release candidate, and enters a process often called a ‘deployment pipeline’. A release candidate cannot change its version, and so must be protected from any further amendments, otherwise it will invalidate subsequent tests in the deployment pipeline. All feedback from the deployment pipeline identifies the release from its version number.

Deployments during the release process (to QA, Staging and so on) will be automated in the same way as they are during development, the idea being that you are practicing and refining real deployments from as early in the development cycle as possible. However, there will often be manual checks for security assessment and compliance, depending on the application and the data is uses. This is where the wider team beyond the development activity get to check what has been done with the release candidate.

Flyway Enterprise will help automate the delivery of the database materials that need to be processed or approved, for a timely release. Flyway Enterprise release artefacts provide everything required for the release, can be inspected without access to the underlying source, and cannot be changed. They contain detailed change reports that allow all participants to know precisely what has changed, when and why.

It also deals with the requirement to verify a deployment:

Automatic drift check before the deployment runs – by comparing to a ‘model’ of the target database, included in the deployment package. It is possible to use a schema snapshot, captured when that release was first deployed, to do the same task.

Post-deployment checks to verify that the deployment worked, and the result is exactly what was tested and signed off for released. Flyway Enterprise can do this by comparing the result of the deployment with the build version that got the sign-off. It can produce a report that proves you’ve run the necessary post-deployment checks

Summary

Flyway Enterprise and Redgate Test Data Manager are designed to support DevOps processes where the resource, in this case a database, can be spun up for a quick, specific task, modified as required, and then destroyed and recreated. They work together to use automation and scripting to establish a working build of the database quickly, integrate subsequent changes continuously, or as frequently as possible, and then provision disposable copies and runs tests to prove that the database fulfills all requirements. Over time, database changes become easier, and the database build and deployment processes get more predictable, and far less likely to introduce problems into the production systems.

Redgate Test Data Manager

Redgate Flyway

Redgate Monitor

Overview

Protect

Automate

Monitor

Product articles Redgate Flyway CI pipelines
A Database DevOps Workflow Using Flyway…

A Database DevOps Workflow Using Flyway Enterprise and Redgate Test Data Manager

An overview of the workflow