Data Container Revisions, Resets and Graduations
Redgate Test Data Manager allows developers to save each new version of a database as a data container revision. After making local development changes to the container, or running tests, they can instantly reset it to its starting revision. They can also load any previous revision and can even 'graduate' a revision to an image, providing a new baseline for ongoing team development. These techniques are especially effective when used in conjunction with Flyway, which automatically tracks the version of every copy of the database.
Follow the series
What are Clones? – Get Started with Clones – ▶ Development and Testing with Clones
One of the great advantages of using data containers for database development is the ease and speed with which you can create or reset them. In Redgate Test Data Manager, only the development changes made to each container require extra storage space on disk (in a diff file, per container). This minimizes the hardware resources needed for storing database files and makes it possible to have a separate container for every current revision of a database under development. It also makes it very easy to undo database changes, just by resetting the container to the latest saved revision.
This minimizes the need to do development work within transactions or, if you are using an RDBMS like MYSQL that can’t rollback DDL in a transaction, it removes the reliance on undo migration files, or the need to restore backups.
Overview of the save, load, reset and graduate commands
Using Redgate Test Data Manager’s cloning CLI (rgclone
), you can save revisions of a data container, using the save
command:
rgclone save data-container <Name-Of-The-Container>
This allows you to access several previous versions via the load
command, specifying the revision you need, in this case rev.0
which means “as originally created from the data image“:
rgclone load data-container <Name-Of-The-Container> --revision=rev.0
If you’re working on the latest version of the database, say, you might make some additional changes to try out an idea for a feature, or check a possible performance enhancement. You can then discard the changes instantly, by resetting the data container to the previous saved revision:
rgclone reset data-container <Name-Of-The-Container>
This is the operation that allows us to perform a teardown after running any tests that require modifications to the database, either schema or data, without needing to create a new data
Finally, you can use the graduate
command to create a new image from the current revision of a data container, or from a revision that you specify:
rgclone graduate data-container < Name-Of-The-Container Or_ContainerID> --revision <RevisionID> --name <GraduatedImageName>
Working with images, containers and revisions
The way you use a data container depends on your database methodology, but they are always best considered as volatile copies of the database that are used for a single process, such as a test-run, or a branch development, and then destroyed.
We can use one container per branch and several developers could connect to it and make changes, such as by running Flyway migrations. The owner of the data container, as authorized by the authentication token, can save revisions. In practice this would mean that the token would be ‘owned’ by the process that automated the process of saving each Flyway version as a revision. However, we’ll save the topic of automation to the next article. Here, we’ll just be executing PowerShell code from the command line.
The test drive
The demo PowerShell code that follows uses rgclone
to create a data image, and from it a data container hosting an empty Pubs database. After connecting to the data container from Flyway and migrating the database to a new version, it will save the data container as a revision and then demonstrate the use of the load
, reset
and graduate
commands, discussed above.
I’ll show the full code listing later in the article, but first let’s take it for a test drive. The code starts by creating an image definition (YAML) file that uses a prescript to create an empty Pubs database. Thergclone
CLI uses this file to create a data image, and from this image a data container, which is a working database instance (a SQL Server instance, in this example) running in a container. You can see previous articles such as Getting Started with Cloning in Redgate Test Data Manager for details.
We then connect to the data container from Flyway and execute the flyway
info
command. It confirms that it’s an empty database and that there are three pending migration files in the Flyway project:
PathToPubs> Get-content $Credentials -Raw | flyway info -reportEnabled=true -configFiles=- Loaded configuration from standard input Schema history table [master].[dbo].[flyway_schema_history] does not exist yet Schema version: << Empty Schema >> +-----------+---------+----------------------------+------+---------+----------+ | Category | Version | Description | Type | State | Undoable | +-----------+---------+----------------------------+------+---------+----------+ | Versioned | 1.1 | FirstRelease | SQL | Pending | Yes | | Versioned | 1.2 | SecondRelease1-1-3to1-1-11 | SQL | Pending | No | | Versioned | 1.3 | ThirdRelease1-1-11to1-1-15 | SQL | Pending | No | +-----------+---------+----------------------------+------+---------+----------+
So, we then use flyway
migrate
to migrate the Pubs database to the latest version, which is v1.3:
PathToPubs> Get-content $ConfigFile -Raw | flyway migrate -configFiles=- (deleted verbose content) Successfully applied 3 migrations to schema [dbo], now at version v1.3 (execution time 00:36.657
We now save
this new version as a data container revision:
PathToPubs> rgclone save data-container $ContainerName Successfully saved data container 'Pubs-TestRevisioning-mssql-develop-container' (55) New revision is 'rev.1'
Whenever we need a fresh copy of v1.3 for testing, we simply load
it as a rev.1
data container. Since rgclone
gives the initial data container a revision of rev.0
, we can also the load
command to revert to an empty database, rather than use the flyway
clean
command (though in practice it’s sometime faster to do a clean):
PS S:\work\Github\FlywayTeamwork\Pubs> rgclone load data-container $ContainerName --revision=rev.0 Successfully loaded data container 'Pubs-TestRevisioning-mssql-develop-container' (55) New revision is 'rev.0-eivl.0'
We can prove that we once more have the empty database by running flyway
info
…
PathToPubs> Get-content $Credentials -Raw | flyway info -reportEnabled=true -configFiles=- Loaded configuration from standard input Schema history table [master].[dbo].[flyway_schema_history] does not exist yet Schema version: << Empty Schema >> +-----------+---------+----------------------------+------+---------+----------+ | Category | Version | Description | Type | State | Undoable | +-----------+---------+----------------------------+------+---------+----------+ | Versioned | 1.1 | FirstRelease | SQL | Pending | Yes | | Versioned | 1.2 | SecondRelease1-1-3to1-1-11 | SQL | Pending | No | | Versioned | 1.3 | ThirdRelease1-1-11to1-1-15 | SQL | Pending | No | +-----------+---------+----------------------------+------+---------+----------+
…but then instantly recreate a working version v1.3 by loading rev.
1. No need to run flyway
migrate
:
PathToPubs> rgclone load data-container $ContainerName --revision=rev.1 Successfully loaded data container 'Pubs-TestRevisioning-mssql-develop-container' (55) New revision is 'rev.1-mgxy.0'
The database is now back at version 1.3. The Flyway Schema History table is preserved with the database, so there is no need for any action to keep Flyway in sync with the container.
Imagine that you’re working on a feature branch, in version control, and have previously saved your progress as a rev.1
data container. You’ve subsequently made some rash changes and need to undo them quickly. In this case, I’ll just ‘accidentally’ clean the database, removing all objects:
PathToPubs> Get-content $Credentials -Raw | flyway clean -configFiles=- -reportEnabled=true Loaded configuration from standard input Successfully dropped pre-schema database level objects (execution time 00:00.083s) Successfully cleaned schema [dbo] (execution time 00:04.233s) Successfully cleaned schema [classic] (execution time 00:00.862s) Successfully cleaned schema [accounting] (execution time 00:00.591s) Successfully dropped post-schema database level objects (execution time 00:00.357s)
A flyway
info
command will confirm that it’s all gone. Sweating slightly, I run the reset
command:
PathToPubs>rgclone reset data-container $ContainerName Successfully reset data container 'Pubs-TestRevisioning-mssql-develop-container' (55)
Running flyway
info
again confirms that I’ve successfully reverted the database to starting revision point:
PathToPubs> Get-content $Credentials -Raw | flyway info -reportEnabled=true -configFiles=- Schema version: 1.3 +-----------+---------+------------------------------+--------+---------+----------+ | Category | Version | Description | Type | State | Undoable | +-----------+---------+------------------------------+--------+---------+----------+ | | | << Flyway Schema Creation >> | SCHEMA | Success | | | Versioned | 1.1 | FirstRelease | SQL | Success | Yes | | Versioned | 1.2 | SecondRelease1-1-3to1-1-11 | SQL | Success | No | | Versioned | 1.3 | ThirdRelease1-1-11to1-1-15 | SQL | Success | No | +-----------+---------+------------------------------+--------+---------+----------+
There are times when a developer might decide that the latest data container revision should be ‘graduated’ to a data image so the team can then use it as the ‘baseline’ for ongoing work. This is especially useful if it takes a while to migrate a database from ‘clean’ to the current version.
PathToPubs> rgclone graduate data-container $ContainerName --revision=rev.1 --name "$($ContainerName)-V1.3" Successfully graduated data container 'Pubs-TestRevisioning-mssql-develop-container' at revision 'rev.1' to a n ew data-image New data image 'Pubs-TestRevisioning-mssql-develop-image-V1.3' (62) available!
Now, it would make a lot of sense when working in a feature branch to have a system that creates a revision for every successful migration within the branch. You wouldn’t want to store historic migrations, only the ones you’d want to revert to in testing. We’ll be showing, in a subsequent article, how to do this within a callback so that you can subsequently ‘undo’ to them.
The PowerShell script
In the meantime, here is the PowerShell script that I used to create the necessary environment and resources for this article so you can ‘play along’:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 |
$Flywaylocation='C:\ProgramData\chocolatey\lib\flyway.commandline\tools\flyway-10.16.0\flyway.cmd' if (!(Get-Command 'Flyway' -ErrorAction SilentlyContinue)){ Set-Alias -Name 'Flyway' -Value $Flywaylocation } <# Set Powershell to save files in UTF-8. This keeps you out of trouble when writing files that are then read by Flyway. #> $PSDefaultParameterValues['Out-File:Encoding'] = 'utf8' <# exit codes for RGClone. We need these not just to find out what went wrong but also because RGClone produces an error for authentication, but there is currently no way of querying as to whether the user is still authenticated before you do an operation. #> $RGCloneExitCodes= @( @{'ExitCode'=0;'Reason'='Success'; 'Explanation'='rgclone command completed successfully'} @{'ExitCode'=1;'Reason'='Unexpected Error'; 'Explanation'='Used when no other error code is suitable. It''s the default error code.'} @{'ExitCode'=2;'Reason'='Resource Not Found'; 'Explanation'='a resource (e.g. data image or data container) could not be found.Can be an error (e.g. you try to get a non-existent data image or container – e.g. rgclone get dc 17) or not (rgclone get di will return 2 when no data images exist).'} @{'ExitCode'=3;'Reason'='Unrecognised Command'; 'Explanation'='arequested command is not recognised (e.g. rgclone idonotexist).'} @{'ExitCode'=4;'Reason'='Operation Failed'; 'Explanation'='The rgclone operation failed (usually server-side).'} @{'ExitCode'=5;'Reason'='Output Error'; 'Explanation'='an error occurred while writing to the output stream.'} @{'ExitCode'=6;'Reason'='Resource Not Specified'; 'Explanation'='a resource (data image or data container) was not specified for an operation.'} @{'ExitCode'=7;'Reason'='Authentication Failed'; 'Explanation'='An authentication operation failed (in any rgclone command, not only auth).'} @{'ExitCode'=8;'Reason'='Ctl Error'; 'Explanation'='An error occurred while configuring the rgclone tool itself. This is typically an internal only error and in that case indicates a bug in the product.'} @{'ExitCode'=9;'Reason'='Forbidden Error'; 'Explanation'='a user does not have access to the requested resource (e.g. data image or data container) or operation (e.g. delete).'} @{'ExitCode'=10;'Reason'='Update Check Error'; 'Explanation'='the rgclone check for update has failed (typically when communications with the server are not successful).'} ) # rgclone auth # We need to make the flyway directory our current working directory $WorkDirectory="<pathTo>\Pubs" #I'm using the Database-Project-RDBMS-branch(-Version) convention $CloneProject='Pubs-TestRevisioning-mssql-develop' Cd $WorkDirectory $ImageName="$CloneProject-image" #We create the name of the image $ImageSpecFilename="$env:USERPROFILE\$($ImageName)Spec.yaml" # any YAML file name will do # it will be saved in the user area-change the line above if you have another preference $ContainerName = "$CloneProject-container" #we create the YAML file that defines the image and the name of the database @" sourceType: empty name: $ImageName engine: mssql version: 2019 preScript: CREATE DATABASE PUBS; "@ >"$ImageSpecFilename" #write the YAML to disk (We can reuse etc) $output = rgclone create data-image -f $ImageSpecFilename # we check the return code just in case we get something wierd. $Result=$RGCloneExitCodes|where {$_.ExitCode -eq $LastExitCode} if ($result.ExitCode -ne 0) {Throw "That didn't end well $($result.Reason)`n $($result.Explanation)"} else {Write-verbose "$output"} # we will keep the secrets such as connection and credentials in the user area $secrets="$env:USERPROFILE\$ContainerName($(Get-Date -UFormat "%A-%d-%b-%Y")).json" # we will keep the clone going but normally you'd set a default to expire it rgclone create data-container --image $ImageName --name $ContainerName --lifetime 0 --output json >$Secrets # we check the output in case $Result=$RGCloneExitCodes|where {$_.ExitCode -eq $LastExitCode} if ($result.ExitCode -ne 0) {Throw "That didn't end well $($result.Reason)`n $($result.Explanation)"} else {Write-verbose "Data Container $ContainerName created from $imageName."} # Now we will read in the secrets to create the flyway configuration file $ConnectionDetails=Get-content $Secrets -Raw|ConvertFrom-json # now stock the variables we need to create the conf file $engine=$ConnectionDetails.engine; $Name=$ConnectionDetails.Name; $Tail= "$(if ($engine -like 'MSSQL') {";databaseName=$DatabaseName;trustServerCertificate=true;"} elseif ($engine -like 'postgresql') {"/$($DatabaseName.ToLower())"} else {"/$DatabaseName"})" if ($_.engine -notlike 'oracle') {$JDBCURL=(($ConnectionDetails.jdbcConnectionString -ireplace '[;?]user=.{1,200}', '') -ireplace '(?<!/)/[^/].{1,100}', '') +$tail} else {$JDBCURL=$ConnectionDetails.jdbcConnectionString} #if you have changed the format of the $CloneProject , then you'll # need to change this $Credentials="$env:USERPROFILE\"+(($name -split '-'|select -first 4) -join '_')+'.conf' @" # connection details for $($name) flyway.url=$JDBCURL flyway.user=$($ConnectionDetails.user) flyway.password=$($ConnectionDetails.password) "@>$Credentials Write-verbose "written out flyway connection and authentication details to $Credentials " # you'll need this if you want to determine the current version after the operation <# So we then do the migration to version 1.3 #> <a id="post-6985113-_Hlk172631010"></a>Get-content $Credentials -Raw | flyway clean -configFiles=- -reportEnabled=true #if the report of the version was there, then read it in if (Test-Path -Path 'report.json' -PathType Leaf) {$CurrentVersions=Get-content 'report.json' -Raw|ConvertFrom-json} else {write-error "The report file 'report.json' wasn't generated"} $DatabaseVersion = $CurrentVersions.individualResults[0].targetSchemaVersion <a id="post-6985113-_Hlk172631076"></a> <#Use flyway to migrate the database to a new version # Get-content $Credentials -Raw | flyway migrate -configFiles=- -reportEnabled=true <#We now save this so that we have a version 1.3 that we can load as 'rev.1' whenever we want a clean copy for testing. (It is necessary for the reset, graduate and load Commands). Rev.0 should give us the empty database. #> rgclone save data-container $ContainerName <# Now we use RGClone to load version 1.3, which we saved as rev.1 #> rgclone load data-container $ContainerName --revision=rev.1 #should result in V1.3 <# The Reset commend will restore the database to its most recent revision #> rgclone reset data-container $ContainerName <# To check the current version of the database #> Get-content $Credentials -Raw | flyway info -reportEnabled=true -configFiles=- <# Well, I can always 'graduate' any of the revisions of my data-container. This means that I can create a data image from any of them. I'd want to do this if I want to use its current state as the baseline for future work, and if it takes a while to migrate a database from 'clean' to the current version. All we need to do is to create suitable names so things don't get lost or over-written #> $NewName="$((($ContainerName -split '-')| select -first 4) -join '-')" $GraduateName="$NewName-V1.3-Image" $NewContainerName="$NewName-V1.3-Container" $Newsecrets="$env:USERPROFILE\$NewContainerName($(Get-Date -UFormat "%A-%d-%b-%Y")).json" Write-verbose "graduating container '$ContainerName' to image '$GraduateName'"; rgclone graduate data-container $ContainerName --revision=rev.1 --name $GraduateName; Write-verbose "Creating new container '$NewContainerName' from image '$GraduateName'"; rgclone create data-container --image $GraduateName --name $NewContainerName --lifetime 0 --output json >$Newsecrets ; |
Summary
Data containers provide some obvious advantages over using a conventional database for database development, especially when testing, or doing branch development. They are geared to the lone process or developer rather than a shared team development.
The save
command of the rgclone
CLI allows the database developer to save a particular version of the database, running in a container, as a revision. The reset
command reverts a data container quickly to the starting revision. With the load
command, a database can be loaded repeatedly at any revision. A revision can be saved as an image, via the graduate
command, to create copies for testing, or for later use as ‘references’ to establish what exactly is in each version of a database.
With data container, a branch can be isolated, developed, tested and then discarded with a delightful sense of irresponsibility, and without fear of affecting other lines of development until the merge. The obvious initial use is for testing, making the stages of setup and tear-down far quicker and easier. However, the opportunities that are presented for isolated branch development are hard to ignore. Containerized databases make the whole process inherently tidier and easier to manage, and Redgate Test Data Manager’s rgclone
CLI makes it all practical without straining network resource.