Deploying Multiple Databases from Source Control using SQL Change Automation
How to automatically build multiple test databases, from source control, and then fill them with standard test data sets, using SQL Change Automation, BCP and some PowerShell magic.
Quite often, in a database development project, you need to create several copies of the database under development, either at the current version of the build or at a specific previous version. You then need to fill them with a version of the development data that has been anonymized. For regression or integration testing, you often need a standard, unchanging set of data against which you can test the results of a process. We’ll use the SQL Change Automation PowerShell cmdlets to do all this.
I’ve described how to set up SCA in a previous article, where I demonstrated a simple process that builds a version of a database, publishes it as a NuGet package, and uses it to synchronize a target database. It is best to refer to that article for instructions on setting things up.
Here, we’ll get slightly more ambitious and show how to maintain a group, or cell, of databases. As this is already quite a long article, we’ll deal with topics such as how to anonymize the data separately, in a later article. We’ll also duck any questions about a specific source control system by simply using a code directory.
Parallel database testing
Let’s take a simple, imaginary example. The lead developer of AdventureWorks wants to be able to implement a test cell that will allow the team to run not only the current version of AdventureWorks through the many batches of integration and regression tests, but also one of the previous versions. Tests are being added, in the light of bugs that have appeared, and the developers would like to know when a bug was introduced, and what changed at that point.
The team need to develop using a standard, anonymized version of the actual database data, which is kept in a database on a server. This database has the same table structure as the current build. The databases must end up stocked with data, and ready for work. If the build fails, the process must stop at that point, and everything must be logged, so that the process can run overnight.
The team decide to implement a PowerShell-scripted process that will take the build from source control, and use it to update a list of databases, on various servers, to be at the required version, and then stock each with the standard test data set. The solution will, ultimately, allow them to keep an archive of the build components for each version, as a zipped NuGet package, in a configuration management archive, so that they can build any previous version from its NuGet package. Obviously, all this must be a relatively quick and easy task.
How the script works
SCA PowerShell components coexist easily with the standard PowerShell components. I use SQL Server Management Objects (SMO), as exposed within the sqlserver PowerShell module, to do all the routine stuff that it is so good for, and use the SCA Cmdlets for the specialized things.
Together, these two modules provide a comprehensive toolkit. Much of this script is similar to the version in my previous article, and so I will not repeat here the descriptions of the various processes, such as checking for connections and valid paths, validating the build, and so on.
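Assuming both modules are already installed, loading them takes only a couple of lines. A minimal sketch, which also confirms which versions you have:

Import-Module SqlChangeAutomation
Import-Module sqlserver -DisableNameChecking
Get-Module SqlChangeAutomation, sqlserver | Select-Object Name, Version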
Validation
To check or validate a build, SCA takes the source code, uses it to build a temporary database on a server, runs the checks, and then deletes it. SCA is relaxed about the form of the source code: you can build from a variety of sources, such as a single build script file or a source code directory.
For simple databases, you can even use LocalDB for the temporary database, but don’t count on it. The server needs to be running the same version of SQL Server as the target, or a higher one, and must have all the extra components, such as Full-Text Search, that the database uses. This temporary database needs some care to set up, but once it is configured for the project, it looks after itself.
Since we’re building a database that is tested and then disposed of, we don’t need the complication of building within a single transaction that can be rolled back on error. If your database uses Full-Text Search, it can’t be done within a single transaction anyway.
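In isolation, the validation step amounts to a single cmdlet call. Here is a sketch, with a placeholder project path and temporary-server connection string; the full script, later, assembles these values from its configuration:

$tempServer = 'Data Source=MyTemporaryServer;Integrated Security=True'   # placeholder connection string
$validatedProject = Invoke-DatabaseBuild 'D:\MyDatabaseproject\project' `
                        -TemporaryDatabaseServer $tempServer `
                        -SQLCompareOptions 'NoTransactions'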
Kill ‘n’ Fill
The only good reason for doing a migration of a database is to preserve the existing data. Test databases are unlikely to be able to preserve the correct data for long, by the very nature of the tests, and the same is usually true of development databases. Unit tests, and any other types of test of a complete process, are likely to change the data. For this exercise, we’ll use a different approach, which is to delete the existing database along with its data, and create it all anew. You can script a backup first in PowerShell, but I haven’t shown that here (see instead How to Automatically Create and Refresh Development and Test Databases using SQL Clone and SQL Toolbelt).
So, the PowerShell routine will kill the destination databases before building the new databases and then filling them with data. The best and fastest way of copying the data is to use native-mode BCP. You use BCP both for the copying of data into a file directory, and for inserting it into a newly-built database. Each table resides in its own file. For this demonstration, we will keep the data in a file directory rather than attempt to store the data in the NuGet package, because of the distraction and overhead of zipping and unzipping.
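For a single table, the pair of bcp commands looks roughly like this (server names, database names and file paths are placeholders; -N exports in Unicode native format, -T uses Windows authentication, and on import -E keeps identity values while -q turns quoted identifiers on):

bcp Person.Person out "C:\Data\Person_Person.bcp" -N -T -d AdventureWorks2012 -S MySourceServer
bcp Antipas.Person.Person in "C:\Data\Person_Person.bcp" -N -T -E -q -S MyTargetServer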
If you have anonymized the original data in place, then the kill-and-fill technique makes it far less likely that any remains of the original data will be left after it has been copied across to the new databases. Conversely, if you anonymize the data and then copy the database across to each destination, either by backup-and-restore, or by detaching, copying the MDF/LDF files and re-attaching, then both the database transaction log and the data pages are likely to retain vestiges of the data you were trying to mask.
A complication of using BCP native-mode imports is that they leave database constraints disabled, so they all need to be re-enabled once the data imports are finished. This can introduce a subtle bug if you had deliberately disabled one or more of those constraints beforehand. Re-enabling them requires a SQL script, because the SMO (sqlserver) way of doing it is clunky. On the plus side, this should serve as an illustration of how to make SQL calls within PowerShell when using SMO.
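Once a database has been filled, the re-enabling boils down to one call through the SMO connection, where $DataTargetObject is the SMO Server object for the target, as in the script below. The WITH CHECK option makes SQL Server re-validate the imported data as it re-enables each constraint:

$DataTargetObject.ConnectionContext.ExecuteNonQuery(
    "EXEC sp_msforeachtable 'ALTER TABLE ? WITH CHECK CHECK CONSTRAINT all'")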
Each of the destination servers must have the correct version of SQL Server, as well as all the additional features required for the database to work. A build is rather more sensitive to this than a restore or an attach. AdventureWorks can easily be restored onto a server, even if full-text search isn’t installed, but if you try to build it from the source code, it will cause an error.
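If you’re not sure whether a destination instance has Full-Text Search installed, it is worth checking before you start. A quick sketch using the sqlserver module, with a placeholder instance name:

Invoke-Sqlcmd -ServerInstance 'MyTargetServer' -Query "SELECT SERVERPROPERTY('IsFullTextInstalled') AS FullTextInstalled"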
The PowerShell script
Once again, I’ve provided the configuration data in the body of this script. This isn’t done because I think it is a good idea, but to keep things simple. It is best kept separately, so the same process can be used in several different contexts. Otherwise, any change in a path, database name, or the like, would require a change in version in your source control system. Also, I don’t even like keeping usernames in PowerShell scripts. In my previous article, I showed how to read the information into the $config PowerShell variable, as a hashtable, and how to ensure that the data was all there.
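One simple way of keeping the settings out of the script is to put the hashtable literal in a .psd1 data file and read it in at the start. A sketch, assuming a hypothetical file called AdventureWorks.config.psd1 sitting alongside the script:

# AdventureWorks.config.psd1 contains just the hashtable, e.g. @{ 'Project' = '...'; 'PackageId' = 'AdventureWorks'; ... }
$config = Import-PowerShellDataFile -Path "$PSScriptRoot\AdventureWorks.config.psd1"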
The script starts by reading the configuration data into a hashtable. Once it has initialized everything, and made sure that all the file paths exist, it creates a ServerConnection object for each server in the list and adds it to the hashtable. This provides all the login information for the various tools: SCA, SMO and BCP. With this login information, we can then kill any existing versions of the target databases and create new, empty databases. We go through all the databases, check that we can connect to each of them via SCA, and store the SCA connection information.
We now read the source control directory and use SCA to build the temporary database on a server, thereby validating the source. We keep the validated project object, which we later use both to create the NuGet package and to build the target databases.
Before we build and stock all the databases, we must export all the data from the data source. In reality, this would contain your masked and anonymized data, but here we just use the data from the live AdventureWorks database, the source of which we already have in source control. We use BCP to export the data from each table in native format into files in a file directory, one for each table.
We use the validated project object to synchronize the database schema with each empty database that we have created. This is, in effect, a clean build. We then fill each database table with the data for it, and enable constraints.
set-psdebug -strict
$ErrorActionPreference = "stop"
<# variables that you need to fill in for each project. Normally I have this as a separate file
   and read the relevant data file in according to the project being built.
   The script adds to the config object as it is executed #>
$config = @{
    # The SQL Change Automation project (in source control directory) to validate, test and sync, eg; D:\MyDatabaseproject\project
    'Project' = 'MyPathToTheProject';
    # The directory to store the NuGet package in, eg; D:\MyDatabaseproject\artefacts
    'BuildArtifact' = 'MyPathToTheArtifacts';
    # The directory to store the log in, eg; D:\MyDatabaseproject\Logs
    'LogDirectory' = 'MyPathToTheLogFile';
    # the directory where the data is kept, eg; D:\MyDatabaseproject\Data
    'DataDirectory' = 'MyPathToTheData';
    'PackageId' = 'AdventureWorks';
    # the version in the NuGet package
    'PackageVersion' = '1.0.1';
    'Databases' = @{
        'Temporary' = #the temporary database. Use a blank string for the user for windows auth.
        @{
            'ServerInstance' = 'Philf00';
            'Database' = 'master';
            'Username' = '';
            'SQLCompareOptions' = 'NoTransactions';
        };
        'DataSource' = #the database with the current data.
        @{
            'ServerInstance' = 'Philf01';
            'Database' = 'AdventureWorks2012';
            'Username' = 'PhilFactor';
            'SQLCompareOptions' = 'NoTransactions';
        };
        'Antipas' = #a target database. Use a blank string for the user for windows auth.
        @{
            'ServerInstance' = 'Philf02';
            'Database' = 'Antipas';
            'Username' = 'PhilFactor';
            'SQLCompareOptions' = 'NoTransactions';
        }
        'Phasael' = #a target database. Use a blank string for the user for windows auth.
        @{
            'ServerInstance' = 'Philf03';
            'Database' = 'Phasael';
            'Username' = '';
            'SQLCompareOptions' = 'NoTransactions';
        }
        'Archaelus' = #a target database. With windows auth, use a blank string for the user.
        @{
            'ServerInstance' = 'Philf04';
            'Database' = 'Archaelus';
            'Username' = '';
            'SQLCompareOptions' = 'NoTransactions';
        }
    }
}
# and some handy constants
$MS = 'Microsoft.SQLServer'
$My = "$MS.Management.Smo"
$errors = 0 #keep a count of the errors we encounter
$TheLogFile = "$($config.LogDirectory)\logfile.txt"
<#
Trap {
    # Handle the exception
    $err = $_.Exception
    $MyErrorMessage = $err.Message
    while ($err.InnerException)
    {
        $MyErrorMessage += ', and ' + $err.InnerException
    };
    "$((Get-Date).ToString()) - Unhandled error '$MyErrorMessage'">>$TheLogFile;
    # End the script.
    break
}
#>
#Load the SCA and SMO (sqlserver) modules
$ImportErrors = @()
Import-Module SqlChangeAutomation -ErrorAction silentlycontinue -ErrorVariable +ImportErrors
Import-Module sqlserver -DisableNameChecking -ErrorAction silentlycontinue -ErrorVariable +ImportErrors
if ($ImportErrors.count -gt 0) #if we couldn't import something
{
    $ImportErrors | foreach{
        "The $($config.'PackageId') build gave an error '$($_)'"
        "$((Get-Date).ToString()) - $_">>$TheLogFile;
        write-warning $_;
        #$errors++ #keep a count of the errors
    }
}
#check and, if necessary, create all directories specified by the config
@("$($config.LogDirectory)", "$($config.DataDirectory)", "$($config.BuildArtifact)") | foreach{
    # If necessary, create the directory for the artefact
    if (-not (Test-Path -PathType Container $_))
    {
        # we create the directory if it doesn't already exist
        New-Item -ItemType Directory -Force -Path $_;
    }
}
@("$($config.Project)") | Foreach{
    if (-not (Test-Path -PathType Container $_))
    {
        $TheError = "The project file directory for $($config.'PackageId'),'$($config.Project)' isn't there"
        "$((Get-Date).ToString()) - $TheError">>$TheLogFile;
        Write-error $TheError;
        $errors++; #keep a count of the errors
    }
}
<# We'll start by checking that all the servers can be reached with the connection information,
   so we can quickly ascertain that it is all OK. We make the connection checks, and abort on
   failure, before wasting time starting the process. #>
$config.Databases.GetEnumerator() | foreach{
    $ConnectionErrors = @() # to store any connection errors in
    $Database = $_
    $db = $Database.Value;
    $conn = new-object "$MS.Management.Common.ServerConnection"
    $conn.ServerInstance = $db.ServerInstance
    if ($db.username -ieq '')
    {
        # Crikey, this is easy. Windows passwords: don't you love 'em?
        $conn.LoginSecure = $true;
    }
    else
    {
        <# This is a more elaborate process than you might expect, because we can't assume that we can
           use Windows authentication, because of Azure, remote servers outside the domain, and other
           such complications. We can't ever keep passwords for SQL Server authentication as part of
           the static script data. At this stage, we ask for passwords if they aren't known, and
           otherwise store them as secure strings on file in the user area, protected by the
           workstation security. #>
        #create a connection object to manage credentials
        $conn = new-object "$MS.Management.Common.ServerConnection"
        $conn.ServerInstance = $db.ServerInstance
        $encryptedPasswordFile = "$env:USERPROFILE\$($db.Username)-$($db.ServerInstance).txt"
        # test to see if we have the password in a secure string stored in the user area
        if (Test-Path -path $encryptedPasswordFile -PathType leaf)
        {
            #has already got this set for this login so fetch it
            $encrypted = Get-Content $encryptedPasswordFile | ConvertTo-SecureString
            $Credentials = New-Object System.Management.Automation.PsCredential($db.Username, $encrypted)
        }
        else #then we have to ask the user for it
        {
            #hasn't got this set for this login, so ask, and save it for next time
            $Credentials = get-credential -Credential $db.Username
            $Credentials.Password | ConvertFrom-SecureString | Set-Content $encryptedPasswordFile
        }
        $conn.LoginSecure = $false;
        $conn.Login = $Credentials.UserName;
        $conn.SecurePassword = $Credentials.Password;
    }
    $db.ServerConnection = $conn;
}
<# We kill any existing versions of the databases and create the new blank databases.
   Be very careful not to kill our DataSource!
   Normally, you'd back up existing versions of databases just in case #>
if ($errors -eq 0)
{
    $config.Databases.GetEnumerator() |
    where Name -inotin ('Temporary', 'DataSource') |
    foreach{
        $DestinationServerObject = new-object ("$My.Server") $_.Value.ServerConnection
        if ($DestinationServerObject.Version -eq $null)
        {
            $TheError = "Could not connect to the server $($_.Value.ServerInstance) with the credentials"
            write-warning $TheError
            "$((Get-Date).ToString()) - $TheError ">>$TheLogFile;
            $errors++
        }
        else
        {
            #if the database already exists, then kill it
            If (@($DestinationServerObject.Databases | % { $_.Name }) -contains $_.Value.Database)
            {
                $DestinationServerObject.KillDatabase($_.Value.Database)
            }
            $DestinationDatabaseObject = New-Object ("$My.Database") ($DestinationServerObject, $_.Value.Database)
            $DestinationDatabaseObject.Create()
            if ($DestinationDatabaseObject.name -ne $_.Value.Database)
            {
                $TheError = "Can't create the database '$($_.Value.Database)' in '$($_.Value.ServerInstance)'"
                write-warning $TheError
                "$((Get-Date).ToString()) - $TheError ">>$TheLogFile;
                $errors++
            };
        }
    }
}
<# Now that we know the destinations exist as empty databases, we can test the connections for each,
   to be certain we can connect to them via SCA, and store the connection information #>
if ($errors -eq 0)
{
    $config.Databases.GetEnumerator() | foreach{
        $Database = $_
        $db = $Database.Value
        #We get the password on the fly for the cmdlet that creates the SCA connection object for each database
        if ($db.ServerConnection.Login -eq '')
        {
            $db.Connection = New-DatabaseConnection -ServerInstance $db.ServerInstance -Database $db.Database
        }
        else
        {
            $db.Connection = New-DatabaseConnection `
                -ServerInstance $db.ServerInstance -Database $db.Database `
                -Username $db.ServerConnection.Login -Password $db.ServerConnection.Password;
        }
        #now we can test that the credentials get to the server
        $TestResult = Test-DatabaseConnection $db.Connection `
            -ErrorAction silentlycontinue -ErrorVariable ConnectionErrors
        if ($ConnectionErrors.count -gt 0) #if we couldn't connect to something
        {
            write-warning "$($Database.Name) gave an error $($connectionErrors[0])";
            "$((Get-Date).ToString()) - $($Database.Name) of $($config.'PackageId') gave an error $($connectionErrors[0])">>$TheLogFile;
            $errors++
        }
    }
}
<# we now read the source control directory, and use the temporary server to build the database so as to validate it. #>
if ($errors -eq 0) #if there were errors, then it gives up at this stage and reports the errors.
{
    # we now check whether we can build this without errors
    $tempServerConnectionString = $config.Databases.Temporary.Connection.ConnectionString.UnmaskedValue
    # Validate the SQL Change Automation project and import it into a ScriptsFolder object
    try
    {
        $validatedProject = Invoke-DatabaseBuild $config.Project `
            -TemporaryDatabaseServer $tempServerConnectionString `
            -SQLCompareOptions $config.Databases.Temporary.SQLCompareOptions 3>>$TheLogFile
    }
    catch #could not get the -ErrorAction silentlycontinue -ErrorVariable ConnectionErrors to work
    {
        "$($config.'PackageId') couldn't be validated because $($_.Exception.Message)" | Foreach{
            write-warning $_
            "$((Get-Date).ToString()) - $_">>$TheLogFile;
            $errors++;
        }
    }
}
<# now we can get all the data from the masked database. We make an SMO connection, check that it is good,
   and then iterate through the tables, using BCP to export the data from each table in native format.
   Each file is named after the table name and schema.
   This isn't entirely foolproof if you are using non-standard table or schema names. #>
if ($errors -eq 0) #if there were errors, then it gives up at this stage and reports the errors.
{
    $TheDataSource = $config.Databases.DataSource;
    $DataSourceInstanceObject = new-object ("$My.Server") $TheDataSource.ServerConnection;
    if ($DataSourceInstanceObject.Version -eq $null)
    {
        Throw "Can't find the instance $($TheDataSource.ServerInstance)"
    };
    $DataDatabaseObject = $DataSourceInstanceObject.Databases[$TheDataSource.Database]
    if ($DataDatabaseObject.name -ne $TheDataSource.Database)
    {
        Throw "Can't find the database '$($TheDataSource.Database)' ($($DataDatabaseObject.name)) in instance $($TheDataSource.ServerInstance)"
    };
    $DataDatabaseObject.Tables | foreach {
        $filename = "$($_.schema)_$($_.Name)" -replace '[\\\/\:\.]', '-'
        $TheScriptPath = "$($config.DataDirectory)\$($filename).bcp"
        if (-not ($TheDataSource.ServerConnection.LoginSecure))
        {
            $WhatHappened = bcp "$($_.Schema).$($_.Name)" out "`"$TheScriptPath`"" `
                -n -N "-d$($TheDataSource.Database)" "-U$($TheDataSource.Username)" `
                "-P$($TheDataSource.ServerConnection.Password)" `
                "-S$($TheDataSource.ServerInstance)";
        }
        else
        {
            $WhatHappened = bcp "$($_.Schema).$($_.Name)" out "`"$TheScriptPath`"" `
                -n -N -T "-d$($TheDataSource.Database)" "-S$($TheDataSource.ServerInstance)";
        }
        if ($WhatHappened -like '*Error*') { throw ($whatHappened) };
    }
}
$SyncErrors = @() # we collect sync errors here
<# we create the NuGet package because we want a permanent record of the build. We can add other
   processes at the same time, such as documenting the database or adding a SQL Codeguard check #>
if ($errors -eq 0) #if there were errors, then it gives up at this stage and reports the errors.
{
    # OK, we can now build it
    #produces the NuGet package
    if ($validatedProject.GetType().Name -ne 'ScriptsFolder')
    {
        $TheError = "$($config.PackageId) could not be verified."
        write-warning $TheError;
        "$((Get-Date).ToString()) - $TheError">>$TheLogFile;
    }
    else
    {
        $buildArtifact = $validatedProject |
            New-DatabaseBuildArtifact -PackageId $config.PackageId -PackageVersion $config.PackageVersion
        if ($buildArtifact.GetType().Name -ne 'SocBuildArtifact')
        {
            $TheError = "$($config.PackageId) build artefact could not be created."
            write-warning $TheError;
            "$((Get-Date).ToString()) - $TheError">>$TheLogFile;
        }
        $buildArtifact | Export-DatabaseBuildArtifact -Path "$($config.BuildArtifact)" -force
    }
}
<# Now we create the database schema in each empty target database #>
if ($errors -eq 0) #if there were errors, then it gives up at this stage and reports the errors.
{
    $PriorSyncErrors = $SyncErrors.count # detect an increase in sync errors
    $config.Databases.GetEnumerator() |
    where Name -notin ('Temporary', 'DataSource') |
    foreach{
        # Sync a database
        $DatabaseSettings = $_.Value;
        $syncResult = Sync-DatabaseSchema `
            -Source $validatedProject -Target $DatabaseSettings.Connection `
            -AbortOnWarningLevel None -SQLCompareOptions $DatabaseSettings.SQLCompareOptions `
            -ErrorAction silentlycontinue -ErrorVariable +SyncErrors 3>>$TheLogFile
        if ($SyncErrors.count -gt $PriorSyncErrors) #if there was a new sync error
        {
            $TheError = "$($_.Name) gave a sync error $($SyncErrors[$PriorSyncErrors])";
            "$((Get-Date).ToString()) - $($config.'PackageId') $TheError">>$TheLogFile;
            $errors++;
            break;
        }
        $DataTargetObject = new-object ("$My.Server") $DatabaseSettings.ServerConnection
        $DestinationDatabase = $DatabaseSettings.Database
        $DestinationServer = $DatabaseSettings.ServerInstance
        if ($DataTargetObject.Version -eq $null)
        {
            Throw "Can't find the instance $($DestinationServer)"
        };
        $TargetDatabaseObject = $DataTargetObject.Databases[$DestinationDatabase]
        if ($TargetDatabaseObject.name -ne $DestinationDatabase)
        {
            Throw "Can't find the database '$($DestinationDatabase)' in instance $($DestinationServer)"
        };
        <# Now we BCP all the table data in. As we are using native mode, the utility disables
           constraints for the table before doing the import #>
        $TargetDatabaseObject.Tables | #for every table
        foreach {
            $filename = "$($_.schema)_$($_.Name)" -replace '[\\\/\:\.]', '-';
            $TheScriptPath = "$($config.DataDirectory)\$($filename).bcp";
            if (-not ($DatabaseSettings.ServerConnection.LoginSecure))
            {
                $WhatHappened = BCP "`"$DestinationDatabase`".`"$($_.Schema)`".`"$($_.Name)`"" `
                    in "`"$TheScriptPath`"" -q -n -N -E "-U$($DatabaseSettings.Username)" `
                    "-P$($DatabaseSettings.ServerConnection.Password)" "-S$($DestinationServer)";
            }
            else
            {
                $WhatHappened = BCP "`"$DestinationDatabase`".`"$($_.Schema)`".`"$($_.Name)`"" `
                    in "`"$TheScriptPath`"" -q -N -T -E "-S$($DestinationServer)";
            }
            if ($WhatHappened -like '*Error*')
            {
                throw ("$whatHappened $DestinationServer $DestinationDatabase $filename")
            };
        }
        # re-enable (and re-check) all the constraints that the native-mode BCP import left disabled
        $result = $DataTargetObject.ConnectionContext.ExecuteNonQuery(
            "EXEC sp_msforeachtable 'ALTER TABLE ? WITH CHECK CHECK CONSTRAINT all'");
        "$DestinationServer $DestinationDatabase has been stocked with data" >> $TheLogFile
    }
}
if ($errors -gt 0) { "$((Get-Date).ToString()) - the build process was aborted">>$TheLogFile; }
Conclusions
The kill ‘n’ fill database build process, which under the covers is a synchronization with an empty database, is an approach that is fine for a relatively small-scale database, as an alternative to a migration approach, or synchronization with a full database of a different version. I use it after an anonymization process, as it leaves no trace of the original data.
For bigger databases, you will need to adopt a different method: restoring a backup, creating a VM, using Docker, or using SQL Clone. There is no single ‘best’ way of copying databases; it is always good, though, to have plenty of alternative techniques to meet your requirements.
In our simple, imaginary example, we have started down the road of achieving what the fictitious developers of the IT department of AdventureWorks wanted. They also had the ideal of being able to deploy database changes contained in a NuGet package, essentially to provide a test database at a previous version as well as the current one. We’ll tackle this in the next article.