How to build-and-fill multiple development databases using PowerShell and SQL Data Generator
This article offers a build-and-fill method for development databases, where each developer will subsequently want to alter the data or metadata in his or her copy of the database.
When you are working with a database, you always need data. This is why you need to stock the database with data after you build it. Sometimes, you just want a large number of made-up customer details, sales figures or the like. You also will need columnar data, sometimes known as ‘static data’ or ‘enumerations’, saved in a file. Occasionally, you need data that is generated statistically using a suitable language such as Python.
My previous article about automatically filling your SQL Server test databases with data, explained how you can use PowerShell and SQL Data Generator (SDG) to automate data provisioning for test databases. The assumption there was that you want to build the database, fill it with data, run the tests, and tear it down again. You’ll never want to modify the test database.
Here, I’m going to show how to extend this build-and-fill method to development databases, where each developer will subsequently want to alter the data or metadata in his or her copy of the database. To do that though, we’ll need a .sqlgen file for every development copy of the database.
Working on a single, shared database
Let’s say the development team uses a single, shared database for development work. You can use the SDG GUI to specify where you want the data taken from, or how you want it generated, for the shared database. Having devised a data generation plan for each table, you can save the associated .sqlgen project file into source control, with the same version number as the database. In this way, you can build the latest version of the database, from a build script in the VCS, and then fill it with data, all in one step.
Each time you make a change to a table, or add a new one, you will also need to use SDG in interactive mode to update the method of generating the data for the changed table, or create a new data generation strategy for the new table. It is a very good idea to get into the habit of treating as a single operation the table changes and the strategy for re-filling the altered tables. In other words, don’t do the first without working out how you going to do the second, and then commit the table changes and the changes to the SDG file to source control at the same time.
This allows the team to work on any previous version of the database from source control, merely by building a previous version from the VCS, and using the corresponding version of the .sqlgen project file to fill it.
Generating a SDG file for each database copy
Things get a little more complicated if, for example, each developer has a dedicated development database that they need to be able to subsequently modify. The problem with SQL Data Generator is that it assumes you are using a single database on a specific server, both of which are specified in the project file.
Let’s say you have the ‘master’ copy of the SDG file, for a specific version of a database, saved in GitHub. You’ve used the script in my previous article to build-and-fill a couple of new copies of this database, using this master file, but you subsequently need to alter the schema of one of the database copies. When you try to open the SDG project to update the data generation plan accordingly, you’ll receive an error.
What you need is a separate SDG project file for each copy of each version of the database you are working on, whether you are using it for test, sandboxing or development. Each of these copies of the SDG file would also be saved to the VCS.
The script in Listing 1 will generate a separate SDG project file for each of the specified database copies. It doesn’t matter what particular server or database (or password) details are stored in the master SDG file, in GitHub, because we can override them from the command line.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
$MyDatabase = 'Customers' #specify the default name of the database. You can over-ride this if you have #a test cell with a lot of different identical versions with different names on one server #specify a list of all the copies of the database that need project files $Databases=@( @{'Server'='MyTestSQLInstance1';'Alias'='test1Dozy';'Database'="$MyDatabase"}, @{'Server'='MyTestSQLInstance2';'Alias'='test2Beaky';'Database'="$MyDatabase"}, @{'Server'='MyDevSQLInstance';'Alias'='Dave';'Database'="$MyDatabase"} ) #change this location to suit your VCS system $MySQLDataGeneratorFileLocation = "$($env:HOMEDRIVE)$($env:HOMEPATH)\Documents\GitHub\$MyDatabase" #you might have a different convention $MySQLDataGeneratorFile="$MySQLDataGeneratorFileLocation\$MyDatabase.sqlgen" #now we just read in the master SQL Data Generator project file [xml] $xdoc = get-content "$MySQLDataGeneratorFile" #we get the datasource section $dataSourceSpec=$xdoc.SelectSingleNode(“//DataSource”) #for each database you specify... $Databases.GetEnumerator() | foreach{ #create a version of the project file with the database details $dataSourceSpec.ServerName = $_.Server $dataSourceSpec.DatabaseName=$_.Database $dataSourceSpec.IntegratedSecurity='True' $xdoc.Save("$MySQLDataGeneratorFileLocation\$($_.Alias).sqlgen")# and save it to source control } |
Listing 1
You must include in this script each location of the database, when creating a new version of the database. Having done so, you can then use any of these SDG project files to access the database within the SQL Data Generator GUI. The one you choose depends on which copy of the database you need to access to tweak the data generation process. Your tweaks will then need to be merged into the master copy of the SDG project file, for that database version, otherwise they won’t be saved.
Adding the build-and-fill processes to the routine
You will probably have spotted that it is easy to modify this code so that you can build each copy of the database, and fill it with data, at the same time as generating the its associated project file.
So, here is the script. Note that the database build script deletes any existing versions of the objects as part of the build.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
pushd; Import-Module sqlps -DisableNameChecking #load the SQLPS functionality popd; # Firstly, we create the aliases to make the execution of command-line programs easier. Set-Alias SQLDataGenerator 'C:\Program Files (x86)\Red Gate\SQL Data Generator 3\SQLDataGenerator.exe' -Scope Script $MyDatabase = 'Customers' #specify the default name of the database. You can over-ride this if you have #a test cell with a lot of different identical versions with different names on one server #specify a list of all the copies of the database that need project files $Databases=@( @{'Server'='MyTestSQLInstance1';'Alias'='test1Dozy';'Database'="$MyDatabase"}, @{'Server'='MyTestSQLInstance1';'Alias'='test2Beaky';'Database'="$MyDatabase"}, @{'Server'='MyDevSQLInstance';'Alias'='Dave';'Database'="$MyDatabase"} ) #change this location to suit your VCS system $MySQLDataGeneratorFileLocation = "$($env:HOMEDRIVE)$($env:HOMEPATH)\Documents\GitHub\$MyDatabase" #you might have a different convention $MySQLDataGeneratorFile="$MySQLDataGeneratorFileLocation\$MyDatabase.sqlgen" #this will be somewhere else $MyReportDirectory = "$($env:HOMEDRIVE)$($env:HOMEPATH)\Documents\GitHub\$MyDatabase\" #now we just read in the master SQL Data Generator project file [xml] $xdoc = get-content "$MySQLDataGeneratorFile" #we get the datasource section $dataSourceSpec=$xdoc.SelectSingleNode(“//DataSource”) #for each database you specify... $Databases.GetEnumerator() | foreach{ #create a version of the project file with the database details $dataSourceSpec.ServerName = $_.Server $dataSourceSpec.DatabaseName=$_.Database $dataSourceSpec.IntegratedSecurity='True' $xdoc.Save("$MySQLDataGeneratorFileLocation\$($_.Alias).sqlgen")# and save it to source control Invoke-Sqlcmd -serverinstance $_.Server -InputFile $MyBuildScript | Out-File -filePath "$MyReportDirectory\$($_.Alias)Build.rpt" sqldatagenerator /project:"$MySQLDataGeneratorFileLocation\$($_.Alias).sqlgen" | Out-File -filePath "$MyReportDirectory\$($_.Alias)Generate.rpt" } |
Listing 2
Of course, because you are using SQLPS to get at the invoke-SQLCMD
command, you can use SMO for all sorts of operations that are specific to your build. If you do so, it is worth creating a server object for each server you install on, and pass that object to the invoke-SQLCMD
command.
Summary
The technique described in this article is simple but quite powerful. It allows you to automate the process of building, and then filling, multiple copies of the same version of the same database.
By creating a separate SDG project file, per database copy, developers can make changes to their own copy of the database and then, in the same step, update the data generation plan to reflect that change. They will then commit the database change and the changes to the data generation plan, in a single step. As long you subsequently merge changes to individual SDG project files back into the master SDG file, for that database version, you’ll always be able to build and fill any version of the database.
If you’d like to explore more of what you can do with PowerShell and SQL Data Generator, download a fully-functional free trial.