Team-based database development with SQL Clone
For most development teams, the database provisioning process involves some element of compromise. Either the process is slow, but the database is realistic or the process is fast but the database unrealistic. Chris Hurley explains why SQL Clone can allow the team to develop and test with a real database, without the compromises.
For most development teams, the database provisioning process involves some element of compromise. Often, developers will work with the up-to-date database schema but only a small sample of test data, for example.
It’s a fast, lightweight approach that is sufficient for most simple tests, but it doesn’t necessarily reveal exactly how the application will behave when dealing with millions of rows of real production data. It means you’ll sometimes encounter some ‘unexpected results’ in production, which are hard to reproduce on the sample data in the test environment.
The alternative is to use ‘real databases’ in the development and test environments. You can perform more realistic testing much earlier in the cycle, spot bugs earlier, and it’s much easier to generate the particular data state needed to replicate an issue.
The compromise, of course, is that it can slow down development and testing processes dramatically. Large databases can take hours to restore from backups, and it may not be possible to have many copies available because of storage requirements.
This often leads to sharing a small number of central, shared development databases, which are infrequently refreshed by a DBA. If you hit a problem, it’s hard to know whether it’s a problem with your code, or the result of someone else’s change. It also makes it very difficult to, for example, set up different databases for different branches or to quickly reset the database to its original state to retry a migration.
This article explains how SQL Clone can allow the team to work with a real database in their development and testing work, but without the compromises. It shows how SQL Clone makes the task of creating new copies of databases near-instant and avoids using up lots of additional disk space.
This is possible because each clone database has a connection to a shared image, which is a copy of the database at a particular point in time. The DBA in the team can use the tool’s PowerShell interface to automate the creation, update and deletion of these clones, as well as deal with issues such as data cleansing, or adjusting logins and permissions for each environment.
Developers can even create and delete their own database clones, whenever they need to, based on images the DBA provides. Suddenly, the database is just a lightweight resource they spin up, integrate into their development workflow, and then tear down again, as required. It’s a big leap from how development teams have traditionally been forced to work with ‘heavyweight’ databases.
Setting up SQL Clone
Setting up SQL Clone for your team involves deciding what you want to install, where. SQL Clone consists of two components: one Server and one or more Agents. The agents do the actual cloning work, and because this requires working with the Virtual Hard Disk service, you need to have an agent installed on each machine running a SQL Server instance you want to use to create images, or which will have clones.
The SQL Clone Server is responsible for coordinating everything that happens, and it should be installed on a shared machine that all the machines running agents can see (as well as any users who are permitted to control the system). The images that you want to share need to be on a file share which all the machines with agents can see, too.
Imagine, for example, that you’ve installed SQL Clone Server on a virtual machine called svr-sqlclone
. You might want to install agents on your shared staging and QA machines, svr-staging
and svr-qa
, and also on some developer machines, dev-sam
and dev-claire
. All of these machines can see svr-sqlclone
on the network, and they can all access \\fileserver\images
.
Once you have the image, you can create as many clones from it as you want, either on one server or on any machine with an agent which can see the image. Each clone takes only seconds to create and only uses the disk space that’s required to store the changes you make. The image file is accessed when reading all of the data you haven’t changed. As this keeps clones independent from each other, you can work on each clone in isolation.
Managing clones and images with PowerShell
DBAs can manage images and clones either by using the SQL Clone web UI or by using the PowerShell cmdlets, both of which they can access from a laptop simply by pointing at the SQL Clone Server. In the team, the DBA can be responsible for creating images and for deploying and managing clones on the shared staging and QA databases.
Each image is a point-in-time copy of a database which can then be used to create multiple clones. The DBA can use SQL Clone’s PowerShell interface to automate image creation from production on a regular schedule. The following PowerShell script shows, minimally, how to automate the creation of images of a sales database, every night, after having set SQL Clone up so that the QA server can access backups in \\fileserver\backups
:
1 2 3 4 5 6 7 8 9 10 |
Connect-SqlClone -ServerUrl 'http://svr-sqlclone:14145' $backupLocation = Get-SqlCloneBackupLocation -Path '\\fileserver\backups' $imageDestination = Get-SqlCloneImageLocation -Path '\\fileserver\images' $imageOperation = New-SqlCloneImage -Name "Sales-$((get-date).ToString("yyyyMMddHHmmss"))" ` -BackupLocation $backupLocation ` -BackupFileName @('Sales-latest.bak') ` -Destination $imageDestination $imageOperation | Wait-SqlCloneOperation |
This image can then be used to rapidly refresh the copies of the Sales database on svr-staging
and svr-qa
using the New-SqlClone
cmdlet.
If the database needs further modification, such as changing permissions or performing data masking, the DBA can extend these PowerShell scripts to make these changes and then create a new image from the clone. The alternative is to restore a backup to a temporary environment in the normal way, perform these changes, and then create an image from that database.
With SQL Clone, keeping the staging and QA servers up to date is now easy, and takes only half the disk space it did before, because both can share the same image. And if it becomes necessary to reset the database on either server back to how it was when the image was taken, that only takes a few seconds, rather than the hours it might take to restore a backup.
Integrating Clone into development workflows
While using SQL Clone in these environments can bring benefits, it’s really when it’s integrated into a development workflow that the biggest benefits become apparent. Before SQL Clone, developers might have had to share a limited number of environments, and if a database needed to be refreshed or reset it could slow things down considerably.
Clones can be created very quickly, and use little additional disk space, so it’s now reasonable for every developer to have multiple copies of the same database accessible from their development machines. This can mean an isolated database for every developer on every branch, and resetting the database after testing some changes takes a matter of moments.
The exact approach depends on the way the DBA and the team choose to work. DBAs who prefer to control deployment of databases onto all machines can configure the permissions of SQL Clone such that only the DBA group can control the system, and create clones on any machine with an agent installed with a few clicks. Dealing with tickets for database refreshes can now take moments, and requests for new environments no longer have to be blocked due to lack of disk space.
Alternatively, DBAs can use SQL Clone to provide a curated set of images from which developers can create clones. For example, they can use PowerShell cmdlets to regularly create an image that contains a copy of the database that’s ready for developer use, having performed any data masking or other manipulations required, and make them available to other members of the team using SQL Clone.
Developers can then log into the SQL Clone web UI and create clones whenever they need them. Whenever they open a new branch, they can make a fresh database based on last night’s image. The schema will be up to date, and they can modify it without worrying about clashing with other team members.
Importantly, because the database has realistic data, migration scripts can be tested against it and, if the migration goes wrong, resetting the database is as easy as deleting the current clone and creating a fresh one.
Alternatively, the DBA might provide developers with PowerShell scripts they can run to create clones. This would allow additional customization to be done to the clone databases after they’re created if, for example, permissions need to be adjusted on a per-developer basis.
Whether using the web UI or PowerShell, it’s always possible for anyone who’s allowed to control the SQL Clone system to see which images are available and where they’re being used to provide clones.
Summary
The best way to integrate SQL Clone into your processes will depend on the way your team works, your network architecture and how you use databases in development. Whichever way you choose to work, eliminating slow and restrictive database restores unlocks new ways of working which can make working with databases just as convenient as working with code.
Find out more about SQL Clone with a 14-day fully functional free trial.
Tools in this post
SQL Provision
Provision virtualized clones of databases in seconds, with sensitive data shielded