Product articles Flyway Database migrations
Flyway’s Validate Command…

Flyway’s Validate Command Explained Simply

The Validate command aims to ensure that Flyway can reliably reproduce an existing version of a database from the source migration scripts by warning you if files are retrospectively added, removed or altered that would prevent it from doing so. Validation errors are Flyway's warning that "the source for this version has changed".

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

What does the Validate command do?

The Validate command aims to make sure that the database can be reliably reproduced from its  source scripts. It raises errors if it detects changes to source migrations, such as a file being retrospectively added, removed or altered, that mean it can’t reproduce an existing version of the database. Flyway compares details of the source migrations files to those already applied to the database and detects differences. The checks run automatically on Migrate, or you can use the Validate command.

It uses the information stored in the Flyway schema history table, maintained by Flyway in the default schema of the connected database, to validate the sequence of migrations that have already been applied to produce the current version. It will attempt to ensure that:

  • None are missing – either because an applied file has been deliberately removed or renamed, or because Flyway can’t currently locate it.
  • No migrations have been added to the sequence – it checks for any migrations with versions below the current schema version that have been injected into the sequence, retrospectively
  • None have been altered retrospectively – it verifies that the checksum value stored for each applied migration file is the same as that calculated for the current file with the same name, in the migration locations. The checksum used is CRC32 for SQL migrations

Validate can also check to ensure that your filenames are compatible with the flyway conventions if you set the parameter -validateMigrationNaming to true (it is set to false by default).

The migrate command will run the validation process automatically, and error-out of the process if violations are detected. Therefore, before running the migration as an overnight process, it is better to run the validate command beforehand to prevent a release being aborted. Although the info command will do a validation, it only does so to update the current status of files, such as flagging files as ‘missing’, or ‘ignored’ if they’ve been inserted into the sequence retrospectively.

Why the need to validate?

Flyway aims to help developers write and test the code that will migrate a database to a requested version, reliably and repeatably, making it more likely that code prepared in development will, where necessary, work reliably to update a production database. It removes the need to make and test special-purpose migration scripts as part of the deployment process, instead using a ‘run’ of migration files in the correct version order.

To do this, it needs to help the team avoid, as far as is possible, version inconsistencies, where a Flyway project produces multiple copies of a database at the same Flyway schema version but with different schema objects. By avoiding this, the team can have much greater confidence that the version of the database they are working with, whether it is being used for bugfixes, development, signoff testing or deployment, is at the version that they think it is, and that the version can be reproduced by Flyway’s current migration scripts.

Flyway Validate warnings serve as a ‘red flag’ that potential version inconsistency has been introduced so the team can investigate and fix it. In the following example:

  1. A Dev database has been created by Flyway using the original sequence of migration files, V1 to V4.
  2. A developer makes retrospective changes to the source migration files, altering one and adding a new one.
  3. The same project is used to migrate a Test database to the same version, from empty
  4. We have version inconsistency: two databases at the same Flyway schema version but with different schemas.

Validation to help avoid Flyway version inconsistency

Flyway can’t avoid this situation completely, but the Validate command can make you immediately aware of it, so you can fix it. In the above example, once the source scripts are altered retrospectively, in step 2, subsequent attempts to Migrate the Dev database will fail with validation errors, the nature of which depends on the problem that was detected. This is Flyway’s warning that “the source for this version has changed“.

I must emphasize that although Flyway’s Validate command encourages you to maintain the integrity of versions, it doesn’t enforce it. It merely gives you the tools to do the right thing. Flyway provides the means to subvert this integrity because the designers of the application realize that there are special occasions where developers must make retrospective changes to the files that constitute a database migration.

How Flyway detects and reports validation errors

The Schema History Table is the essential part of the validation process because its information allows it to verify the continuing integrity of the migration sequence. Although the Validate command will run when there is no Flyway schema history table in the current database, the database has no set version and so Flyway can, in this case, do little more than check that it can connect to the database and access the migration files specified by the configuration.

Flyway will raise validation warnings for retrospective changes to any versioned migration files (including baseline migrations and undo migrations) and to ‘repeatable’ files. The status of files after a validation warning can be ‘Missing’, ‘Pending’, ‘Ignored’ or ‘Future’, depending on the problem detected.

Missing or renamed files

Flyway Validate first checks that the migration files used to create the database and recorded in the Flyway schema History table still exist in the migration locations. If the missing file is older than the most recent file Flyway can find, it labels it as ‘Missing’, if it’s newer it labels it ‘Future’.

Files go missing for various reasons, either accidental, such as due to a network issue, or on purpose, perhaps because someone has decided to concatenate several migrations into one. Sometimes files can simply appear to be missing just because you’re running commands from the wrong working directory. In the following example, Flyway info returns the state of a file as Missing:

flyway migration file missing

If you subsequently run Migrate (or Validate), Flyway will raise a validation error:

Other times, when you need to change the description of what a migration does, you are obliged to make a change to the name of a migration file, because both the type of migration file, and the description of what it does are embedded in the filename. If any part of the filename changes, Flyway will inevitably assume that a file has gone ‘missing’, and a new one has been added.

As you can see, in either case, Flyway will raise a validation error, unless you pre-emptively run the Repair command to tell it that this was deliberate (which changes the status of these files from missing to deleted).

New files added with version below current schema version

Having ensured no missing files, Flyway Validate then checks that no extra files have been inserted into the migration files with versions below the current schema version of the database. As illustrated earlier, these would potentially alter any new database created at this version from the same project. Flyway migrations only record the end version, not the start version, so would otherwise be vulnerable to this type of uncontrolled change.

Flyway flags any files injected with a version number below the current schema version as ‘Ignored’. In the following example, we have an existing database at V1.1.2 and someone has injected a V1.1.1.2 file into the existing sequence. The output from the Flyway info command shows that this file has a State of Ignored:

Migration file ignored

If we now attempt to migrate the database to V1.1.3, Flyway will raise a validation error:

The introduction of a new file of a version lower than the current version is referred to by Flyway as one that is ‘resolved but not applied’. It most often happens when someone uses Undo migrations to revert their branch copy of a database to a previous version and introduces a patch to correct a bug, forgetting that other existing databases are at a higher version.

Flyway Validate will also warn of migrations that are resolved but not applied, just because they are still pending, though these will not result in validation errors when running Migrate.

A change of content in migration files that have already been applied

Having ensured that the available migration files below the current version of the database are the same name and quantity as the ones used to get to that version, Validate then needs to check whether any of these files have been changed.

To do this, it tries to match the checksums of all the files in the migration locations with the checksums already stored for that file in the schema history table, when the current database was created. If it detects a change, Flyway will raise a “migration checksum mismatch” validation error when you run Migrate or Validate:

You may have made innocent changes to a file, without realizing that it would upset Flyway. You may have added comments and documentation. You may have reformatted the text to make it more readable. You may have added data INSERTs that wouldn’t affect the metadata version. It is quite possible that you’ve added DCL SQL commands to define access to metadata, which will be different in each database environment. Flyway is like a guard-dog with limited knowledge. It may growl at the postman but you’re grateful for the growl when it is an unwelcome intruder or a deer eating your vegetable garden.

Although it is sometimes justified, code that changes the metadata of a database retrospectively is not a good sign, even if the database that is the end-result of the process is unchanged. It means that you must check out the changes to be sure that they are safe. If you are using Source Control to monitor these migration files, then it will be immediately obvious if a foolish code change has been made, perhaps one that is made by a developer who has watched ‘Back to the Future’ one too many times. You restore the old version and enshrine the change in a new migration file.

Responding to validation errors: what next?

You can’t ignore a validation failure because, as has been demonstrated, the checks are done before other Flyway commands such as Migrate, so will cause migrations to fail, or for files to be ignored. The Validate command allows you to catch them before then.

flyway validation errors

So, you have a validation error. What do you do? First, you need to investigate why validation has failed. Start by ruling out silly errors. Have you correctly specified the list of paths to directories that contain your migration files? Have you connected to the correct database?

If the configuration is correct, you need to put things right to restore ‘version integrity’ of the migration chain, in other words to make sure that the result of the migration chain has identical tables, functions and other metadata for all database copies that are the same schema version.

Next, depending on what problem occurred, you may decide to run a “big Clean”, or fix the schema history table with a Repair to allow Flyway to perform migrations once more. You may even decide to change the configuration to instruct Flyway not to validate certain files such as the ‘repeatable’ migration files.

I’ll spell out the six main alternatives.

The Big Clean

At the start of the project, especially in a feature branch created from baseline migration file, a team can just do a complete rebuild by cleaning the database and redoing doing the entire migration run.

For example, cunning database developers can use the ‘Big Clean’ trick to condense a vast multitude of insignificant migrations that have lost relevance as individual steps, into a single enormous step, possibly generated from a build script. If they run Clean first, Flyway never knows, as far as that copy of the database is concerned, so no files will be listed as “missing in action”. To save a bit of time, you can even set Flyway to automatically clean and rerun the migrations if it hits a validation error. This can save quite a bit of scripting, but I’d prefer to find out why, first.

This is all fine in an isolated featured branch if you’re condensing files that aren’t yet shared with other branches. However, elsewhere, clean with caution: with great power comes the risk of great disaster, because after a Clean command, Flyway must rebuild the schema history table from scratch. All it knows is what the resulting version should be, from the name of each migration file. It doesn’t know the version of the database it expects to start from, or whether a file was deleted or altered or added. It can lead to version inconsistency, as illustrated earlier.

Repairing benign changes, and intentionally-deleted files

Flyway will perform a validation before a Migrate or Info command, and it will refuse to do a Migrate if it finds a change to a successfully applied migration. You need to correct any intentional changes and alterations to the run or migrations that led up to the current version.

If, for example, you reformat all your SQL code from your migrations to make it easier to understand, you then must ensure that all database instances that use these migration files have their checksums refreshed to the new correct values. The Repair command does this for you. It will also update the descriptions and types of the applied migrations in the schema history tables with the ones in the filename of the available migrations. It will mark all the missing migrations as ‘deleted’. If you are working with a scripted process, it might be best to run a preliminary check to make sure that there are no changes that might cause an error with a migration.

Fail On Missing Locations

It can happen. You add a list of locations or paths to your migration files to your project configuration file. Flyway looks in each path in the list to see if there are any relevant migration files. It first checks whether the location exists: if it doesn’t, then it shrugs, sends a message to STDOUT, and just goes on to the next location. It doesn’t produce an error.

If a location has gone missing due to ‘the gremlins’, then it is likely some of the files that Flyway knows about are missing too, so for existing databases you’ll subsequently get notified with a validation error because Flyway needs to check that old migration stems haven’t been tampered with.

However, Flyway’s default approach of ignoring missing file locations but then subsequently raising validation errors on existing databases, means that it can’t prevent version inconsistencies, because for new or cleaned databases there is no schema history table, so Flyway can’t perform validation.

If a migration location is now missing, perhaps because it was renamed or has vanished from the network, Flyway will just move on to the next location and run the migration to reach the requested version. The most likely consequence is a SQL Error, unless you are storing the migration files for each different schema in different Flyway locations, in which case you may have a rather different, slimmer, database than the one you expected, but without knowing it.

For all these reasons, yes, I advise you to set the failOnMissingLocations parameter to true.

Disabling validation on a migration

By default, Flyway runs Validate automatically on every migration run. If you have many migration files, this can take a very long time. If you’ve already validated, and want to do repeated migrations, it makes sense to do the subsequent migration runs without the repeated validation, though Flyway still must access and ‘resolve’ all the files. This type of shortcut is fine and sensible with a scripted task that involves several Migrate operations, but the parameter must be used responsibly.

Ignoring Flyway’s warnings by tweaking validation rules

You can switch off all or part of the validation process. It sounds a reckless thing to do, as an alternative to a Repair, but the facility was put there because someone needed it badly. The status of files after a validation can be ‘Missing’, ‘Pending’, ‘Ignored’ or ‘Future’, and you can specify any of these to be ignored, though ‘Future’ is ignored by default.

You can choose to ignore a particular file or files using the -ignoreMigrationPatterns configuration parameter.

Tweaking the configuration using parameters

I’ve already written in How to Fix or Avoid ‘Ignored’ Migrations in Flyway about ways of persuading Flyway to execute migration files out of sequence, or to ignore changes in files, so there is no need to repeat these approaches here. They are very useful and have their place.

Conclusions

The Validate command provides a way of checking that the migration files you’ve specified in your Flyway Configuration will recreate the development database you are using. Flyway won’t do a migration without first validating these files, so it makes sense to run a separate validation up-front before a long process to be more confident that nothing will trip it up, and it is likely to produce a valid copy of the version assigned to it.

There are plenty of perfectly sensible development activities that can cause a problem to a Flyway migration. In my case, adding documentation, or formatting code to make it more easily read, will cause Flyway to frown unless you take steps to subsequently repair the Flyway history tables in all the copies of the database being used. Therefore, it is best to run a validation occasionally to get an idea of possible issues to a migration. Normally a migration run is done unattended, so it is best to fix any obvious issues before then, especially the subtle ones that end up with files being ignored rather than triggering an error. It is always possible to end up with a copy of a database that has subtle differences to what is in the Main branch, and that could spell trouble ahead.

Tools in this post

Flyway

DevOps for the Database

Find out more