Flyway for SQL Server Data Tools Users
If you are using SSDT for authoring, building, debugging, and publishing a database project, how do you change to, or preferably migrate towards, a Flyway-based database development? Flyway doesn't need to replace any code part of SSDT, but if allowed to manage every release candidate, it does allow for much cleaner branching, merging, and deployments.
For a SQL Server Data Tools (SSDT) user, Flyway is best understood as an addition rather than a replacement, causing just a relatively small change for the development team. The biggest change will be the process for merging the individual contributions to the work, but you gain a robust versioning system that prevents many of the difficulties associated with database development and provides a great deal more flexibility in building more complex systems.
The advantages of Flyway are discussed in detail in the following articles: Moving from ‘Chaotic’ to ‘Managed’ Database Development using Flyway, Implementing a Database Versioning System, Building a Database with Flyway and Managing database changes using Flyway: an Overview.
The SSDT development method
One of the reasons that SQL Server Data Tools (SSDT) is popular is that you don’t even need access to a development database on a shared server: you can create a local database project on your workstation, or in a Docker container, by publishing a DACPAC of the current development database. You can make these changes to the local database via scripts or a GUI such as a table designer, or SSMS SQL Editor.
You can even dispense with a live database altogether and both test and validate your database design ‘offline’ by ensuring that it compiles successfully. You can add, modify, debug, or delete the definitions of objects in the project such as tables, views or routines and at least be confident that they run.
Either way, local or offline, once you’ve tested your changes, you can produce a DACPAC that is then used to deploy the database schema to a target database instance, either on-premises or in the cloud. The DACPAC files can also be used to compare the schema of two databases and generate a script to update one database to match the other. You can either publish the changes to the target database directly from the UI, if you’re feeling brave, or extract the ‘incremental update’ or synch script for review.
A DACPAC file is a representation of a database and is proprietary to Microsoft SSDT products (SQL Server Data Tools) in Visual Studio, Azure Data Studio and Visual Studio Code (VS Code).
A DACPAC provides a way of preserving the state of a database without needing a live database. The script function of DacFX or SQLPackage will provide a script to update the schema of a database represented by a target DACPAC to match the schema of a source one (it cannot update a DACPAC directly). In this way you can, for example, use the Schema Compare utility to ensure that your local branch project stays in sync with the production database, so that production hotfixes are added to the local database.
Team development with SSDT
Developers can use the SSDT tools to create snapshots of a project (DACPACs, or various types of source directories) at any stage of the development cycle. On a local workstation database, a developer can start by building the latest development version from a DACPAC or SQL script. If the database does not exist on the server, the publish operation creates it. Otherwise, the existing database is updated.
Having, in effect, created a sub-branch of the development branch of your project, the developer can now work independently, or maybe with one or more others on a branch server, until they’ve tested the changes and wish to merge them back into the development database.
After completing work on a branch, they use the SSDT DAC tools such as Schema Compare or the command-line SqlPackage to capture the changes in a DACPAC and submit it to be merged into the development database. With SqlPackage, we compare this DACPAC (source) to the development database (target) and generate a SQLCMD incremental update script to make the schema of target identical to what is in the DACPAC.
The problem with this approach is that, since this is a database-level comparison, the generated script can easily end up unintentionally removing or overwriting any changes committed by other developers in the interim. If you have several individuals working independently, you will have to amend the incremental update file so that it doesn’t interfere with any previously merged contribution. This involves filtering out changes to any objects other than those modified in the branch, which is actually rather difficult to do using SqlPackage (and much easier with a tool like SQL Compare).
Ideally, the team needs a way to review the branch changes, pre merge, for possible conflicts with changes made to the parent branch in the meantime. This is not straightforward in the SSDT model, because it isn’t easy to check directly what is and isn’t in a DACPAC. However, the SSDT DAC tools do allow the team to create a snapshot as a set of object-level scripts. If you have these saved for every new version, it makes it much easier to see what changes were made to the development or production server.
After all the individual ‘offline’ work has been developed, tested, debugged, and merged into the development server, the work can be handed off to authorized personnel to be released to a production environment. If the team don’t have direct access to the production database, they simply hand off a DACPAC of the release candidate, which the deployment team can use to generate an “incremental update” script.
SSDT uses a ‘state-based’ approach to deploying database changes. It compares the ‘states’ of two databases, a source and a target, and generates a script to make the target schema the same as the source. The aim of the script is to upgrade a production database system, in situ, in a way that guarantees to preserve the data and doesn’t disrupt the service.
It’s a simple technique that works well, most of the time. However, because we are autogenerating the script at ‘deployment time’, any problems spotted at this late stage tend to delay deployments. Also, tricky schema changes, such as those that affect existing data, often rely on pre- and post-deployment scripts. It can be very hard to ensure that this sort of partly customized, partly auto-generated deployment is always safe, even if accidentally run on the wrong version of the target database.
If the development team do have access to the production database they can create the incremental update file themselves and test it. If successful, what is essentially a migration file can be sent to staging rather than a DACPAC.
Incorporating Flyway into SSDT development
In an SSDT-Flyway ‘hybrid’ system, the individual development of the database code continues without disruption, as described above. The team can carry on using Visual Studio and all the DACPAC tools. However, at the point where a new version of the database is required, whether to merge branch changes to create a new development version or to deploy a new version to test, QA or production, we use Flyway.
Flyway’s model isn’t really that different, except that all database changes are now scripted, checked, and versioned. If we have a script specifically designed to upgrade a target database, at a specific version, to the new version then deployments become more controlled, and merges become easier too.
From a DACPAC, you would always generate a script rather than just publish changes via the UI, and then, when it’s reviewed, you’d add that to the migration chain so that process can be repeated to provision another database. The DACPAC can still be used to represent a database during development work, and it can come from a developer working on a branch, or it can be extracted from each new version of the development database, as it is created by Flyway, to keep a record of its’ state at that version.
In the following diagram, the development database is now managed by Flyway. We use a Flyway versioned migration script to create each new version of the database, and ideally an undo script to retreat to the previous version. Each time Flyway successfully creates a new development version, we extract from it a versioned DACPAC that a developer can use subsequently to quickly recreate that version.
Flyway-SSDT: branch development and merging
To start work, developers can use a DACPAC to quickly create a local database at any version. Its advantage over a build script is that it allows for automatic insertion of test data. I provide a PowerShell Script block task (
$ExtractFromSQLServerIfNecessary) that will create these versioned DACPACs for you, from the latest Flyway version. See Flyway and SSDT: Extracting a DACPAC from a Flyway-managed Database.
With Flyway, you merge your work by adding the branch’s migration files to the migration files for the development database. You will still need to first edit the file to deal with any conflicts introduced by a different branch, merged before yours, making a conflicting change to an object, such as a table, that your branch depends on.
Using Flyway versioned migrations, we can make this merge process much less error-prone and disruptive, as described in Flyway and SSDT: Merging Work from a Branch.
Flyway-SSDT: autogenerating Flyway-compatible migrations
Flyway requires that each file has a database version, and it creates the database by applying each file in version order. It can, of course, go backwards to a previous version as well, or start with a file that takes the database from blank to the first version that is still of interest.
In our hybrid system, we can use the Script action of SqlPackage to create an incremental update script, which we’ll then need to convert into a Flyway compatible migration script. For example, we’ll need to exclude the Flyway schema history table by hand, remove any database-creation or schema-creation statements, remove transactions, and you’ll have to convert the SQLCMD file to a Flyway file (the placeholders are different). Finally, we rename it to add the version that the migration file will create if applied at the end of the existing chain of migration files. All these details are also described in described in Flyway and SSDT: Merging Work from a Branch.
The UNDO files can be generated simply by comparing a version with the previous state recorded either by a DACPAC or an object-level source. In practice however, I found that SqlPackage could not reliably proceed the undo files, for some reason. By contract, SQL Compare can so this very simply, if we’ve saved the object-level directories for every version (SQL Compare is included in the Enterprise edition of Flyway).
Do we still need DACPACs?
At this point, one might wonder why, or perhaps when, it might be possible to jettison DACPACs as part of the adoption of Flyway. DACPACs could never become a cross-RDBMS standard for representing a database because the format can only be used for SQL Server databases. Any other RDBMS will require a different way of representing the metadata of a database. Therefore, if you need to support multiple RDBMs, then you might well consider moving to a tool like Flyway for both development and deployment.
However, DACDPACs can remain a useful adjunct to SQL Server database development. Most DACPAC operations such as Extract, Export, Register and Upgrade, can also be done in SQL Server Management Studio as well.
DACPACs represent the structure of a database, and it is possible to create a database from them or generate them from a database. Both DACPACs and BACPACs are actually just renamed zip folders. They represent a database. They unpack into several XML documents that represent the details of the origin, the objects in the database, and possibly also data. The BACPAC is used mainly to move or copy reasonably small databases between servers but can also be used to archive a small database.
DACPACs are easily converted to other formats as part of the SSDT tooling. For example, the contents of a DACPAC can be, and should be, extracted as an object-level directory of build scripts, which is nowadays a standard format. When the DACPAC of a source database is compared via an SSDT tool with a target database or its DACPAC, it can produce a migration file that would change the target database from its version to the version represented by the source.
It is possible to place a database version in a DACPAC if it is just two integers,
DacMinorVersion) but you cannot store a standard semantic version. It is, however, possible to get around this by storing these as strings in the
DacApplicationDescription. However, the version is never stored in the live database but in one of the server’s system databases.
DACPACs can store data (as .BCP files) as well as metadata but don’t have the means of synchronizing data as it does with metadata. This must be done by hand-cut pre-and post-synchronization scripts.
Challenges of working with DACPACs in a SSDT-Flyway hybrid system
DACPACs are not always the ideal choice. For me, the worst problem, and one that has caused me some embarrassment in the past, is that you cannot do a quick inspection of the database from its source. It is a clever package system that provides a fairly impenetrable artefact. This is useful as a guard against anyone trying to make ‘uncontrolled’ changes to the database, but when I’m in a development team, I like to see what is going on.
This means that DACPACs are less useful in a development environment where the work is structured around distributed source control methods such as GitHub. How, for example, can you investigate changes to database objects by inspecting their creation scripts when DACPACS don’t even have creation scripts?
The potential problem of schemas
Flyway takes a cross-schema perspective. It specifies the database in terms of a list of schemas, the first in the list being the default. SSDT works on an entire database, and you can’t exclude schemas from a DACPAC using its tools. Whereas Flyway can allow teams to work independently of each other in separate schemas (in MySQL and Oracle, a database and a schema are synonymous, and SQLite doesn’t have schemas). As DACPACs cannot work at the schema level this doesn’t cause any problems in migrating from SSDT to Flyway, it would prove awkward for a migration from Flyway to DACPAC if the team has decided to partition a database into one or more projects.
The creation of Schemas is Flyway’s job. There must be no
SCHEMA statements in a Flyway Migration script, nor should they ever be deleted within a script. Flyway maintains schemas from a comma-delimited list in a .conf file.
The problem of the Flyway Schema Table
Flyway stamps a database with a version, records the history of previous migrations, and maintains the integrity of the migration files (you cannot change them) by means of the checksums stored in the table. You’ll realize immediately that these tables shouldn’t be in the DACPAC or scripts because they could cause big problems. If you imported data into one, it would corrupt the version. Imagine what would happen if you submitted a migration that changed the data in the flyway schema history table. To make this more complicated, the developer can change the name of the table or the schema it is in.
Unfortunately, none of the SSDT tools allow you to exclude either tables or schemas from comparisons, extracts or imports. To generate a migration script for example, the Flyway Schema History table or its data will have to be located and removed, whatever its name or location.
Removing SQLCMD transaction code
As well as the problem of having no way of excluding the Flyway Schema History table from a comparison, and having to remove SQLCMD placeholders and their definitions, the Flyway migration file should be free of transactions because the entire script is run within a transaction. This is done to allow Flyway to cleanly run a migration chain up to the point where the migration to the next version causes an error. Nested transactions are ignored by the SQL Server Database Engine. The outer transaction is either committed or rolled back based on the action taken at the end of this transaction. If it is committed, the inner nested transactions are also committed.
Data in migrations
Data can be placed in a DACPAC, stored as BCP (bulk-copy) files, but there is no documented way of getting it out using an external script. Anything more complicated than merely filling the tables, is done by pre- and post- migration scripts stored within the DACPAC. The usual way of getting around this is to extract the files and use the BCP.EXE utility, but the simplest task of stocking all the tables with data is usually done for you by the tool that is publishing the DACPAC. If your requirements are more complicated, it is usual to write SQLCMD SQL scripts that are then placed within the DACPAC. The publishing process will run them before or after the synchronization process to import data from the data files within the DACPAC.
Data can be exported to a BACPAC, and this can be used to copy a database, both the data and the metadata.
The transition from SSDT to Flyway is reasonably uncomplicated, and while DACPACs are a part of your database development workflows, there are plenty of opportunities to make use of the advantages of each system, to the point where developers can just carry on in isolation using SSDT or SSMS and delivering their work via DACPACs.
I have to admit that I don’t like the task of getting around the restrictions that I’ve outlined in preparing migration scripts from DACPACs: it is difficult to find ways of automating it. In the next article, I provide an Export-ChangeScriptFromDACPACS PowerShell task to help with his conversion task. However, it is much simpler to ‘extract’ the object-level scripts from the DACPAC of each version and then use SQL Compare to generate the migration, because I know that I can provide settings that produce an error-free migration file, both forward and undo.
DACPACs are very convenient as a way of preserving the state of every significant version of a database, where you use it ‘by the book’. However, because I need to use several RDBMSs, I generally have to use the scripts, data-representations and packages that are used in common between the various database systems when using Flyway. DACPACs are for SQL server only, which doesn’t help if you have to publish on a different RDBMS.
However, for a SQL Server shop, it is tempting to use all the available tools. I’ll be going into the details of integrating DACPACs into your Flyway workflow via PowerShell in the next articles on DACPACs:
- Flyway and SSDT: Extracting a DACPAC from a Flyway-managed Database
- Flyway and SSDT: Merging Work from a Branch
Was this article helpful?
Tools in this post
DevOps for the Database