PASS Data Community Summit logo

PASS Data Community Summit 2023

Get all the latest announcements direct to your inbox

How SQL Clone works – the technical details

If you have a very large database, up to 64 TB in size, SQL Clone will let you copy, or 'clone' that database many times, very quickly, making the full database available in multiple SQL Server instances across your development and test environments. And yet in each of these instances, the cloned database takes up only tens of megabytes of disk space, and creating each clone takes seconds, rather than hours. How is this possible? It sounds like magic, but it's not; SQL Clone does all this by making clever use only of standard disk virtualization technologies built into the Windows OS.

Two key concepts underpin the basic functions of SQL Clone: images and clones.

An image is a point-in-time representation of your source SQL Server database, which can be created from database backup files (full and differential), or from a live SQL Server database.

SQL Clone creates the image by first creating a Virtual Hard Disk (VHD), and then copying onto it the database's MDF (schema and data) and LDF (transaction log) files. When creating the image from a live database connection, SQL Clone makes the point-in-time copy of database files using the Windows Volume Shadow Copy Service. When creating the image from backup files (SQL Server backup or Redgate SQL Backup), the image is simply created by restoring the backup. An image is immutable and cannot be altered after it's created.

A clone is an isolated, lightweight copy of the database, created from an image. We can create many clones from a single image. For each clone, SQL Clone creates a VHD, called a differencing disk, on the local instance, with a connection back to the image, so each clone created from an image can access all those same image bytes. All that's stored locally, in the differencing disk, are data pages containing changes made directly to the local clone database.

When you modify anything in a clone database, SQL Server writes the change to the differencing disk. Any change you make to a clone database will have an effect only on that clone, and will be isolated from the image. When you query data, the operating system reconstructs the required data pages from the differencing disk plus the database files in the original image. All of this is invisible to SQL Server, which works with the clone just as it would with any other database.

SQL Clone uses the VHDX format (.vhdx), the Hyper-V virtual hard disk format used in Windows 8 and in Windows Server 2012, and upwards, which support virtual hard disk sizes up to 64TB. If one of these newer Operating Systems is installed on the host machine, SQL Clone will detect it automatically, and it will enable users to create clones of databases up to 64TB. If not, it will default to the older VHD virtual hard disk format (.vhd), which has a 2TB limit.

If you need to make a large database available in, say, ten SQL Server instances across your development and test environments, then with traditional database provisioning techniques, you'd essentially need to copy the source database files ten times. SQL Clone's clever use of standard disk virtualization technologies means you only need to copy the source database once, in creating the image. From there, it will 'virtualize the data', making it available in as many clones as you require. This makes it blazingly fast to produce a clone, no matter the size of the source database.