Resolving Cross Database Dependencies in SQL Change Automation using Local Databases or Clones
How to avoid database build failures, caused by circular cross-database object references, by restoring or provisioning copies of all required databases to the development, build, or test SQL Server instance.
When automating a database build, mutual dependencies with other databases can cause problems. While some types of object, such as stored procedures, can be created without the dependencies, because of deferred name resolution, others can’t. For objects such as views, for example, SQL Server checks that any dependent objects exist at the point that it executes the
ALTER statement. If they don’t, the build will fail with an error. When the dependencies are circular (or mutual), we can’t get around the problem simply by changing the order in which we build the databases and their objects.
This article shows how to resolve these dependency problems by ensuring that the build, development or test server, hosts copies of all the databases in the dependency chain. This can be done by using a database restore, or by using SQL Clone. With these copies in place, SQL Change Automation can then run automated, validated builds for any individual database in the chain. Dependent objects outside the database will now be present in your development and test environments in a way that queries against those objects will return valid data, which means you can go on to run all the integration and performance tests that are required.
An alternative approach for dealing with cross-database dependencies uses SCA pre-deployment scripts to create ‘stubs’ of the referenced objects if they don’t already exist (or SQL Server synonyms to reference cross-server dependencies), before the actual build process begins.
How mutual database dependencies can cause build errors
It is common for application developers to write queries that cross the boundary between databases. This means that a database often has external dependencies. For example, a view in database CD01 references objects in database CD02. Sometimes the dependencies become mutual, or circular, so that objects in database CD02 also refer to objects in database CD01.
In this case, we have the chicken and egg problem. We can’t build CD01 unless CD02 is in place, but we can’t build CD02 unless CD01 is in place.
In a simple case such as this, you could use an SCA pre-deployment script for database CD01, which creates a ‘stub’ of the
BuildProblem view, if the table it references doesn’t yet exist. When the build phase ‘proper’ runs, SCA will then create fully all the objects in each of the databases. However, this stub technique becomes more onerous to develop and maintain as the number and complexity of mutual cross-database references increases and so might not be appropriate in some cases, such as to test one-off changes, or even possible, such as where some of the databases are third-party and you don’t have the source, or even the interface definition.
A common solution to this problem, in these cases, is to restore or provision copies of all required databases to the development, build, or test instance, and automate operations to keep the schema of these databases current in a DevOps release pipeline. This solution is also ideal for situations where you are required to develop or test against masked versions of the production data.
Option 1: Create local database copies by restoring backups
One option to create the local database copies is to use a development server where you already have copies of all databases involved in the dependency chain, or to restore a backup of a production or development database. The restore may be performed either once or on a regular basis. SQL Server doesn’t check dependencies during restores, so you can restore databases involved in the dependency change in any order. This approach may take up time and disk space, depending on the size of the databases you wish to use for development and testing purposes. You may also need to sanitize the data before distributing the ‘clean’ backups, in order to comply with regulations like GDPR or HIPAA.
If we now need to build a new version of database
CD02, any referenced objects in
CD03 will exist and so SQL Server will not raise any errors. For build and test environments, you may wish to restore these database copies rarely, and use automation to keep their schemas current. For databases managed under source control, you may add a step to your DevOps pipeline to automatically update the schemas of these databases using a release artifact in SQL Change Automation. For databases which are not under source control, you can automate schema comparison and deployments by using the SQL Compare command line.
Option 2: Fast database cloning using SQL Provision
SQL Provision offers you additional ways of working when you need to create copies of databases containing dependent objects, because it makes it so quick and easy to create or reset a clone, which requires minimal disk space. Like native database restores, deploying a clone does not validate any dependencies.
With SQL Provision you can create masked images of a database, a process that you can automate, every night for example. You can then create and reset database clones in seconds, on the build server or anywhere else that sanitized copies of the databases are needed, such as development or test instances. This can be conveniently done with SQL Provision extensions for Azure DevOps, or with PowerShell. Creating clones takes a matter of seconds, even for large databases, which enables you to clone “on the fly” in a build or deployment process.
Customizing your database provisioning approach
Whether you create the local copies by performing restores or creating clones, you may wish to customize your approach to automation depending on the task at hand.
Commonly, having created a new build, we’ll want to run a series of automated unit tests. These do not usually require production-like data: you may wish to use a small synthetic dataset, or a limited, sanitized version of the production dataset in the databases on the build instance. This can result in faster runtime for unit tests.
For running performance tests, for example, on a set of dependent databases, you’ll want to work with data with the same characteristics and volume as production.
Schema changes to databases you have restored to satisfy dependencies may introduce empty columns or tables, so queries which reference these may incorrectly return zero rows and have misleading performance. Therefore, you may need to refresh these databases, periodically. If you’re using SQL Provision, refreshing all clones in the dependency chain with a safe, production-like dataset in very fast. Otherwise, you’ll need to periodically automate restoring these databases
Development teams often wish to initiate environment refreshes on their own schedule. If you’re doing so by restoring backups of the dependent database, then you can create a PowerShell script that restores all the backups from a fileshare, with a single execution. You may also find SQL Backup’s ability to back up and restore multiple databases in a single command line operation to be useful to simplify the scripts. If you’re using SQL Provision, you can do the same thing using SQL Clone’s PowerShell cmdlets
If you are a SQL Source Control user, you may additionally script associating the correct clones or restored databases with SQL Source Control. This is particularly helpful for new developers when joining a team.
Build validation using SQL Change Automation
If we ensure all referenced objects exist, we’ll prevent those SQL Server compilation errors in our builds, but this alone doesn’t guarantee that the build will succeed. When you build your database code with a SQL Automation extension, behind the scenes the extension uses the Invoke-DatabaseBuild,
Export-DatabaseBuildArtifacts PowerShell cmdlets. You may also configure your build directly in PowerShell, if you prefer. The
Invoke-DatabaseBuild cmdlet performs a build validation phase that checks that the source can be used to successfully build a database, before creating the build artifact.
When using either the extensions or cmdlets, we supply SQL Change Automation with the details of a SQL Server instance
(-TemporaryDatabaseServer) to validate the build. If you don’t specify an instance, SQL Change Automation will look for an available
LocalDB instance. When you have a group of dependent databases, this build server will also need to host copies of all the databases, as discussed previously.
When we then run a build for an individual database, such as
CD02, the way it works depends on whether we allow SCA to validate the build by creating a new temporary version of the database, on the specified temporary instance (the ‘default’ way to do it), or we specify an existing database (or clone) that it should use instead.
Build validation and dynamic names
If we simply specify the SQL Server instance that SCA should use for build validation, and nothing more, then SCA will create on that instance a temporary database with a dynamically generated name. This pattern works well with local database copies which have been previously restored, or with clones deployed on the fly (options 1 or 2, above).
It will then attempt to run the build, in other words to create all the database objects, as specified by the source. If SCA can’t build the database, it reports the reason. If the build succeeds, SCA creates the build artifact (which we can then use for testing), and then drops the temporary database.
The big advantage of using dynamically generated names for the temporary database is that we can validate that each individual database builds without touching any of the original databases in the dependency chain. This also means that we can run multiple simultaneous builds, which would be especially useful in cases where we simply want to verify all the builds before setting in motion an overnight process.
Here’s an example taken when working with the SQL Change Automation extension for Azure DevOps (it’s very similar when working with SQL Change Automation in other orchestration software). This is the build step where we specify the temporary instance to use for build validation; notice that we leave the Database field blank.
If you’re working directly with the Invoke-DatabaseBuild cmdlet, in PowerShell, then simply don’t specify a name for the temporary database (i.e. don’t specify the
-TemporaryDatabase parameter). Example 3 on the previously-linked documentation page illustrates this, or Phil Factor’s Product Learning article, SQL Change Automation with PowerShell Scripts: getting up-and-running, gives a fully-worked example.
To incorporate unit testing into this process, using SQL Change Automation, you can use the
Invoke-DatabaseTests cmdlet, passing in the build artifact for the
-InputObject parameter, and run your tests against a dynamically named database. The
-SQLDataGeneratorProject parameters allows you to pass in a dataset to the dynamically built database.
The main issue to be aware of when using dynamic names for the temporary database is that you should validate that none of your programmable objects, such as stored procedures, views, and functions use three-part names (
CD02.myschema.myobject) for objects within the same database. In this case, where a copy of the database with the original name (
CD02) already exists on the build server, these 3-part names may cause the build validation to fail or succeed due to a false-positive result. As a best practice, naming this way should be avoided. For example, it is common for developers to sometimes need to host multiple copies of a database on a single instance, perhaps to associate one copy with a feature branch and another copy with master. Three-part names are problematic for that scenario, as well.
Build validation using a named database
If, instead of using a dynamically named temporary database for build validation, you name a specific database that SQL Change Automation should use, then it works slightly differently. Now, the SCA build expects to find the database on the instance and will not create it. Instead, it will validate the build by dropping all the objects inside and rebuilding them. This pattern works best if you’re using SQL Provision.
When using this method, you will fill in the database name in your orchestration software or PowerShell script. In this case, I’ve specified that build validation should use the existing
CD02 database on the build server (created by restoring a backup, or creating a clone)
One advantage to this is that it will work even if you use three-part names. It’s also the only way you can do it if you don’t have permission to create new databases on the build server. The potential downsides are firstly that you can now only run one build validation at a time, and secondly that if the build fails then your original
CD02 database is no longer in a good state for further validations. You will need to reset it. You can mitigate this second downside easily with SQL Provisions clone extensions or PowerShell cmdlets, which allow you to clone or reset a database quickly.
Configuring builds for databases with complex interdependencies requires a bit of planning, but automation tools enable you to tackle even circular dependencies with SQL Change Automation. You may choose to solve this problem by using a native SQL Server backup and restore process to put dependencies in place prior to the build, then keep the schema of those databases up to date automatically using an automated DevOps pipeline, or by automating SQL Compare from the command line. If you would like a more flexible option, SQL Provision gives you additional ways of working by empowering you to create database clones on the fly to satisfy external dependencies for your databases at any point in your software development lifecycle.
Was this article helpful?
Tools in this post
Provision compliant copies of production, for SQL Server DevOps test data management