Deploying Multiple Databases from Source Control using SQL Change Automation
Quite often, in a database development project, you need to create several copies of the database under development. They need to be to the current version of the build, or a previous specific version. You need to fill them with a version of the development data that is anonymized. For regression or integration testing, you often need a standard unchanging version of the data against which you can test the results of a process. We'll use SQL Change Automation PowerShell cmdlets to do this.
I've described how to set up SCA in a previous article, and demonstrated a simple process building a version of a database, publish it as a NuGet Package, and use it to synchronize a target database. It is best to refer to that article for instructions on setting things up.
Here, we'll get slightly more ambitious, and show how to maintain a group, or cell, of databases. As this is quite a long article, we'll deal with topics such as how you anonymize the data, separately, in a later article. We will also duck any questions about specific source control by using a code directory.
Parallel database testing
Let's take a simple, imaginary example. The lead developer of AdventureWorks wants to be able to implement a test cell that will allow the team to run not only the current version of AdventureWorks through the many batches of integration and regression tests, but also one of the previous versions. Tests are being added, in the light of bugs that have appeared, and the developers would like to know when a bug was introduced, and what changed at that point.
The team need to develop using a standard, anonymized version of the actual database data, which is kept in a database on a server. This database has the same table structure as the current build. The databases must end up stocked with data, and ready for work. If the build fails, the process must stop at that point, and everything must be logged, so that the process can run overnight.
The team decide to implement a PowerShell-scripted process that will take the build from source control, and use it to update a list of databases, on various servers, to be at the required version, and then stock each with the standard test data set. The solution will, ultimately, allow them to keep an archive of the build components for each version, as a zipped NuGet package, in a configuration management archive, so that they can build any previous version from its NuGet package. Obviously, all this must be a relatively quick and easy task.
How the script works
SCA PowerShell components coexist easily with the standard PowerShell components. I use SQL Server Management Objects (SMO), as exposed within the sqlserver PowerShell module, to do all the routine stuff that it is so good for, and use the SCA Cmdlets for the specialized things.
Together, these two modules provide a comprehensive toolkit. Much of this script is similar to the version in my previous article, and so I will not repeat here the descriptions of the various processes, such as checking for connections and valid paths, validating the build, and so on.
Validation
To check or validate a build, SCA takes the source code, uses it to builds a temporary database, on a server, runs the checks, and then deletes it. SCA is relaxed about the form of the source code. You can build from a variety of sources such as a single build script file or a source code directory
For simple databases, you can even use LocalDB for the temporary database, but don't count on it. The Server needs to be on the same version, or a higher version, of SQL Server and have all the extra components, such as Full-Text Search (FreeText), that are used in the database. This temporary database needs care but once it is set up for the project, it looks after itself.
Since we're building a database that is tested, and then disposed of, we don't need the complication of building within a simple transaction that can be rolled-back on error. If your database uses Full-Text Search, it can't be done within a single transaction anyway.
Kill 'n' Fill
The only good reason for doing a migration of a database is to preserve the existing data. Test databases are unlikely to be able to preserve the correct data for long, by the very nature of the tests, and the same is usually true of development databases. Unit tests, and any other types of test of a complete process, are likely to change the data. For this exercise, we'll use a different approach, which is to delete the existing database along with its data, and create it all anew. You can script a backup first in PowerShell, but I haven't shown that here (see instead How to Automatically Create and Refresh Development and Test Databases using SQL Clone and SQL Toolbelt).
So, the PowerShell routine will kill the destination databases before building the new databases and then filling them with data. The best and fastest way of copying the data is to use native-mode BCP. You use BCP both for the copying of data into a file directory, and for inserting it into a newly-built database. Each table resides in its own file. For this demonstration, we will keep the data in a file directory rather than attempt to store the data in the NuGet package, because of the distraction and overhead of zipping and unzipping.
If you have anonymized the original data in-place, then the kill-And-fill technique will make it far less likely that any remains of the original data will left after it has been copied across to the new databases. Conversely, if you anonymize the data and then copy the databases across to each destination, either using backup-and-restore, or by detaching, copying the MDF/LDF files and attaching, then both the database transaction log and the data pages are likely to retain vestiges of the data you were trying to mask.
A complication of using BCP native-mode imports is that they leave database constraints disabled. They will all need to be re-enabled, once the data imports are finished. This can introduce a subtle bug, if you had disabled one or more of those constraints, deliberately. This requires a SQL script because the SMO (sqlserver) way of doing it is clunky. On the plus side, this script should serve as an illustration of how to do SQL calls within PowerShell, when using SMO.
Each of the destination servers must have the correct version of SQL Server, as well as all the additional features required for the database to work. A build is rather more sensitive to this than a restore or an attach. AdventureWorks can easily be restored onto a server, even if full-text search isn't installed, but if you try to build it from the source code, it will cause an error.
The PowerShell script
Once again, I've provided the configuration data in the body of this script. This isn't done because I think it is a good idea, but to keep things simple. It is best kept separately, so the same process can be used in several different contexts. Otherwise, any change in a path, database name, or the like, would require a change in version in your source control system. Also, I don't even like keeping usernames in PowerShell scripts. In my previous article, I showed how to read the information into the $config PowerShell variable, as a hashtable, and how to ensure that the data was all there.
The script starts by reading the configuration data into a hashtable. Once it has initialized everything, and made sure that all the file paths exist, it creates the serverConnection objects for each server in the list, and adds them to the hashtable. This provides all the login information for the various cmdlets, SCA, SMO and BCP. With this login information, we can then kill any existing versions of the target databases and create new empty databases. We go through all the databases, check that we can connect to them via SCA, and store the SCA connection information.
We now read the source control directory. We use SCA to build the temporary database on a server and use it to validate the build. We create a NuGet file and store the validated project object for the build stage.
Before we build and stock all the databases, we must export all the data from the data source. In reality, this would contain your masked and anonymized data, but here we just use the data from the live AdventureWorks database, the source of which we already have in source control. We use BCP to export the data from each table in native format into files in a file directory, one for each table.
We use the validated project object to synchronize the database schema with each empty database that we have created. This is, in effect, a clean build. We then fill each database table with the data for it, and enable constraints.
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
|
set-psdebug -strict
$ErrorActionPreference = "stop" #
<#variables that you need to fill in for each project. Normally I have this as a separate file
and read the relevant data file in according to the project being built
The script adds to the config object as it is executed #>
$config = @{
# The SQL Change Automation project (in source control directory) to validate, test and sync eg; D:\MyDatabaseproject\project.
'Project' = 'MyPathToTheProject';
# The directory to store the NuGet Package in. eg; D:\MyDatabaseproject\artefacts
'BuildArtifact' = 'MyPathToTheArtifacts;
# The directory to store the log in. <a id="post-464798-_Hlk518488650"></a>eg; D:\MyDatabaseproject\Logs
'LogDirectory' = ' MyPathToTheLogFile';
# the directory where the data is kept
'DataDirectory' = ' MyPathToTheData'; eg; D:\MyDatabaseproject\Data
'PackageId' = 'AdventureWorks';
# the version in the Nuget Package
'PackageVersion' = '1.0.1';
'Databases' = @{
'Temporary' = #the temporary database. Use a blank string for the user for windows auth.
@{
'ServerInstance' = ' Philf00'; 'Database' = 'master'; 'Username' = ''; 'SQLCompareOptions' = 'NoTransactions';
};
'DataSource' = #the database with the current data.
@{
'ServerInstance' = 'Philf01'; 'Database' = 'AdventureWorks2012'; 'Username' = 'PhilFactor'; 'SQLCompareOptions' = 'NoTransactions';
};
'Antipas' = #a target database. Use a blank string for the user for windows auth.
@{
'ServerInstance' = ' Philf02'; 'Database' = 'Antipas'; 'Username' = 'PhilFactor'; 'SQLCompareOptions' = 'NoTransactions';
}
'Phasael' = #a target database. Use a blank string for the user for windows auth.
@{
'ServerInstance' = ' Philf03'; 'Database' = 'Phasael'; 'Username' = ''; 'SQLCompareOptions' = 'NoTransactions';
}
'Archaelus' = #a target database. With windows auth, use a blank string for the user.
@{
'ServerInstance' = ' Philf04'; 'Database' = 'Archaelus'; 'Username' = ''; 'SQLCompareOptions' = 'NoTransactions';
}
}
}
# and some handy constants
$MS = 'Microsoft.SQLServer'
$My = "$MS.Management.Smo"
$errors = 0 #keep a count of the errors we encounter
$TheLogFile = "$($config.LogDirectory)\logfile.txt"
<#
Trap
{
# Handle the exception
$err = $_.Exception
$MyErrorMessage = $err.Message
while ($err.InnerException)
{
$MyErrorMessage += ', and '+ $err.InnerException
};
"$((Get-Date).ToString()) - Unhandled error '$MyErrorMessage'">>$TheLogFile;
# End the script.
break
} #>
#Load SMO assemblies
$ImportErrors = @()
Import-Module SqlChangeAutomation -ErrorAction silentlycontinue -ErrorVariable +ImportErrors
Import-Module sqlserver -DisableNameChecking -ErrorAction silentlycontinue -ErrorVariable +ImportErrors
if ($ImportErrors.count -gt 0) #if we couldn't import something
{
$ImportErrors | foreach{
"The $($config.'PackageId') build gave an error '$($_)'"
"$((Get-Date).ToString()) - $_">>$TheLogFile;
write-warning $_;
#$errors++ #keep a count of the errors
}
}
#check and if necessary create all directories specified by the config
@("$($config.LogDirectory)", "$($config.DataDirectory)", "$($config.BuildArtifact)") | foreach{
# If necessary, create the directory for the artefact
if (-not (Test-Path -PathType Container $_))
{
# we create the directory if it doesn't already exist
New-Item -ItemType Directory -Force -Path $_;
}
}
@("$($config.Project)") | Foreach{
if (-not (Test-Path -PathType Container $_))
{
$TheError = "The project file directory for $($config.'PackageId'),'$($config.Project)' isn't there"
"$((Get-Date).ToString()) - $TheError">>$TheLogFile;
Write-error $TheError;
$errors++; #keep a count of the errors
}
}
<#
We'll start by checking that all the servers can be reached with the connection information.
we can quickly ascertain that all the server connection information is OK
We can make the connection checks, and abort on failure before wasting time starting the process.#>
$config.Databases.GetEnumerator() | foreach{
$ConnectionErrors = @() # to store any connection errors in
$Database = $_
$db = $Database.Value;
$conn = new-object "$MS.Management.Common.ServerConnection"
$conn.ServerInstance = $db.ServerInstance
if ($db.username -ieq '')
{
# Crikey, this is easy, windows Passwords. Dont you love 'em?
$conn.LoginSecure = $true;
}
else
{
<# This is more elaborate a process than you might expect because we can't assume that we can use
Windows authentication, because of Azure, remote servers outside the domain, and other such
complications. We can't ever keep passwords for SQL Server authentication as part of the static
script data. At this stage, we ask for passwords if they aren't known, and otherwise store them
as secure strings on file in the user area, protected by the workstation security.
#>
#create a connection object to manage credentials
$conn = new-object "$MS.Management.Common.ServerConnection"
$conn.ServerInstance = $db.ServerInstance
$encryptedPasswordFile = "$env:USERPROFILE\$($db.Username)-$($db.ServerInstance).txt"
# test to see if we know about the password un a secure string stored in the user area
if (Test-Path -path $encryptedPasswordFile -PathType leaf)
{
#has already got this set for this login so fetch it
$encrypted = Get-Content $encryptedPasswordFile | ConvertTo-SecureString
$Credentials = New-Object System.Management.Automation.PsCredential($db.Username, $encrypted)
}
else #then we have to ask the user for it
{
#hasn't got this set for this login
$Credentials = get-credential -Credential $SourceLogin
$Credentials.Password | ConvertFrom-SecureString |
Set-Content "$env:USERPROFILE\$SourceLogin-$SourceServerName.txt"
}
$conn.LoginSecure = $false;
$conn.Login = $Credentials.UserName;
$conn.SecurePassword = $Credentials.Password;
}
$db.ServerConnection = $conn;
}
<# We Kill any existing versions of the databases and create the new blank databases.
Be very careful not to kill our Datasource! Normally, you'd back up existing versions
of databases just in case #>
if ($errors -eq 0)
{
$config.Databases.GetEnumerator() | where Name -inotin ('Temporary', 'DataSource') | foreach{
$DestinationServerObject = new-object ("$My.Server") $_.Value.ServerConnection
if ($DestinationServerObject.Version -eq $null)
{
$TheError = "Could not connect to the server $($_.Value.ServerInstance) with the credentials"
write-warning $TheError
"$((Get-Date).ToString()) - $TheError ">>$TheLogFile;
$errors++
}
else
{
#if the database already exists, then kill it
If (@($DestinationServerObject.Databases | % { $_.Name }) -contains $_.Value.Database)
{ $DestinationServerObject.KillDatabase($_.Value.Database) }
$DestinationDatabaseObject = New-Object ("$My.Database") ($DestinationServerObject, $_.Value.Database)
$DestinationDatabaseObject.Create()
if ($DestinationDatabaseObject.name -ne $_.Value.Database)
{
$TheError = "Can't create the database '$($_.Value.Database)' in '$($_.Value.ServerInstance)"
write-warning $TheError
"$((Get-Date).ToString()) - $TheError ">>$TheLogFile;
$errors++
};
}
}
}
<#Now that we know the destinations exist as empty databases, we can test the connections for each
to be certain we can connect to them via SCA and store the connection information #>
if ($errors -eq 0)
{
$config.Databases.GetEnumerator() | foreach{
$db = $_.Value
#We get the password on the fly for the Cmdlet that creates the SCA connection object for each database
if ($db.ServerConnection.Login -eq '')
{
$db.Connection =
New-DatabaseConnection -ServerInstance $db.ServerInstance -Database $db.Database
}
else
{
$db.Connection =
New-DatabaseConnection `
-ServerInstance $db.ServerInstance -Database $db.Database `
-Username $db.ServerConnection.Login -Password $db.ServerConnection.Password;
}
#now we can test that the credentials get to the server
$TestResult = Test-DatabaseConnection $db.Connection `
-ErrorAction silentlycontinue -ErrorVariable ConnectionErrors
if ($ConnectionErrors.count -gt 0) #if we couldn't connect to something
{
write-warning "$($Database.Name;) gave an error $($connectionErrors[0])";
"$((Get-Date).ToString()) - $($Database.Name;) of $(
$config.'PackageId') gave an error $($connectionErrors[0])">>$TheLogFile;
$errors++
}
}
}
<#we now read the source control directory, and use the temporary server to build the database so as to validate it.
#>
if ($errors -eq 0) #if there were errors, then it gives up at this stage and reports the errors.
{
# we now check whether we can buld this without errors
$tempServerConnectionString = $config.Databases.Temporary.Connection.ConnectionString.UnmaskedValue
# Validate the SQL Change Automation project and import it inot a ScriptsFolder object
try
{
$validatedProject =
Invoke-DatabaseBuild $config.Project `
-TemporaryDatabaseServer $tempServerConnectionString `
-SQLCompareOptions $config.Databases.Temporary.SQLCompareOptions 3>>$TheLogFile
}
catch #could not get the -ErrorAction silentlycontinue -ErrorVariable ConnectionErrors to work
{
$_.Exception.Message
"$($Database.Name;) of of $($config.'PackageId') couldn't be validated because $(
$_.Exception.Message)" | Foreach{
write-warning $_
"$((Get-Date).ToString()) - $_">>$TheLogFile;
$errors++;
}
}
}
<# now we can get all the data from the masked database. we make an SMO connection, check that
it is good, and then iterate through the tables, using BCP to export the data from each table
in native format. Each file is named after the table name and schema. This isn't entirely foolproof
if you are using non-standard table or schema names. #>
if ($errors -eq 0) #if there were errors, then it gives up at this stage and reports the errors.
{
$TheDataSource = $config.Databases.DataSource;
$DataSourceInstanceObject = new-object ("$My.Server") $TheDataSource.ServerConnection;
if ($DataSourceInstanceObject.Version -eq $null)
{ Throw "Can't find the instance $($TheDataSource.ServerInstance)" };
$DataDatabaseObject = $DataSourceInstanceObject.Databases[$TheDataSource.Database]
if ($DataDatabaseObject.name -ne $TheDataSource.Database)
{
Throw "Can't find the database '$($TheDataSource.Database)' ($(
$DataDatabaseObject.name)) in instance $($TheDataSource.ServerInstance)" };
$DataDatabaseObject.Tables |
foreach {
$filename = "$($_.schema)_$($_.Name)" -replace '[\\\/\:\.]', '-'
$TheScriptPath = "$($config.DataDirectory)\$($filename).bcp"
if (-not ($TheDataSource.ServerConnection.LoginSecure))
{
$WhatHappened =
bcp "$($_.Schema).$($_.Name)" out "`"$TheScriptPath`"" `
-n -N "-d$($TheDataSource.Database)" "-U$($TheDataSource.Username)" `
"-P$($TheDataSource.ServerConnection.Password)" `
"-S$($TheDataSource.ServerInstance)";
}
else
{
$WhatHappened =
bcp "$($_.Schema).$($_.Name)" out "`"$TheScriptPath`"" `
-n -N -T "-d$($TheDataSource.Database)" "-S$($TheDataSource.ServerInstance)";
}
if ($WhatHappened -like '*Error*') { throw ($whatHappened); $whatHappened };
}
}
$SyncErrors = @() # we collect sync errors here
<# we create the Nuget Package because we want a permanent record of the build. We can add
other processes at the same time such as making NuGet packages, documenting the database or
adding a SQL Codeguard check #>
if ($errors -eq 0) #if there were errors, then it gives up at this stage and reports the errors.
{
# OK we can now build it
#produces the nuget package
if ($validatedProject.GetType().Name -ne 'ScriptsFolder')
{
$TheError = "$($config.PackageId) could not be verified."
write-warning $TheError;
"$((Get-Date).ToString()) - $TheError">>$TheLogFile;
}
else
{
$buildArtifact = $validatedProject |
New-DatabaseBuildArtifact -PackageId $config.PackageId -PackageVersion $config.PackageVersion
if ($buildArtifact.GetType().Name -ne 'SocBuildArtifact')
{
$TheError = "$($config.PackageId) build artefact could not be created."
write-warning $TheError;
"$((Get-Date).ToString()) - $TheError">>$TheLogFile;
}
$buildArtifact | Export-DatabaseBuildArtifact -Path "$($config.BuildArtifact)" -force
}
}
<# Now we create the database schema in each empty target database #>
if ($errors -eq 0) #if there were errors, then it gives up at this stage and reports the errors.
{
$PriorSyncErrors = $SyncErrors.count # detect an increase in sync errors
$config.Databases.GetEnumerator() | where Name -notin ('Temporary', 'DataSource') | foreach{
# Sync a database
$DatabaseSettings = $_.Value;
$syncResult =
Sync-DatabaseSchema `
-Source $validatedProject -Target $DatabaseSettings.Connection `
-AbortOnWarningLevel None -SQLCompareOptions $DatabaseSettings.SQLCompareOptions `
-ErrorAction silentlycontinue -ErrorVariable +SyncErrors 3>>$TheLogFile
if ($SyncErrors.count -gt $PriorSyncErrors) #if there was a new sync error
{
$TheError = "$($_.Name) gave a sync error $($connectionErrors[$PriorSyncErrors])";
"$((Get-Date).ToString()) - $($config.'PackageId') $TheError">>$TheLogFile;
$errors++;
break;
}
$DataTargetObject = new-object ("$My.Server") $DatabaseSettings.ServerConnection
$DestinationDatabase = $DatabaseSettings.Database
$DestinationServer = $DatabaseSettings.ServerInstance
if ($DataTargetObject.Version -eq $null)
{ Throw "Can't find the instance $($DestinationServer)" };
$TargetDatabaseObject = $DataTargetObject.Databases[$DestinationDatabase]
if ($TargetDatabaseObject.name -ne $DestinationDatabase)
{ Throw "Can't find the database '$($DestinationDatabase) in instance $($DestinationServer)" };
<# Now we BCP all the table data in. As we are using native mode the utility
disables constraints for the table before doing the import #>
$TargetDatabaseObject.Tables | #for every table
foreach {
$filename = "$($_.schema)_$($_.Name)" -replace '[\\\/\:\.]', '-';
$TheScriptPath = "$($config.DataDirectory)\$($filename).bcp";
if (-not ($TheDataSource.ServerConnection.LoginSecure))
{
$WhatHappened =
BCP "`"$DestinationDatabase`".`"$($_.Schema)`".`"$($_.Name)`"" `
in "`"$TheScriptPath`"" -q -n -N -E "-U$($DatabaseSettings.Username)" `
"-P$($DatabaseSettings.ServerConnection.Password)" "-S$($DestinationServer)";
}
else
{
$WhatHappened =
BCP "`"$DestinationDatabase`".`"$($_.Schema)`".`"$($_.Name)`"" `
in "`"$TheScriptPath`"" -q -N -T -E "-S$($DestinationServer)";
}
if ($WhatHappened -like '*Error*')
{ throw ("$whatHappened $DestinationServer $DestinationDatabase $filename") };
$result = $DataTargetObject.ConnectionContext.ExecuteNonQuery(
"EXEC sp_msforeachtable 'ALTER TABLE ? WITH CHECK CHECK CONSTRAINT all'");
"$DestinationServer $DestinationDatabase has been stocked with data" >> $TheLogFile
}
}
}
if ($errors -gt 0)
{
"$((Get-Date).ToString()) - the build process was aborted">>$TheLogFile;
}
|
Conclusions
The kill 'n' fill database build process, which under the covers is a synchronization with an empty database, is an approach that is fine for a relatively small-scale database, as an alternative to a migration approach, or synchronization with a full database of a different version. I use it after an anonymization process, as it leaves no trace of the original data.
For bigger databases, you will need to adopt the method of restoring a backup, or creating a VM, using Docker or using SQL Clone. There is no single 'best' way of copying databases. It is always good, though, to have plenty of alternative techniques to meet your requirements.
In our simple imaginary example, we have started down the road of achieving what the fictitious developers of the IT Department of AdventureWorks wanted. The also had the ideal of being able to deploy database changes contained in a NuGet package, basically to provide a test database at a previous version as well as the current one. We'll tackle this in the next article.





Loading comments...