Documentation
Cloning, in general, means "producing similar populations of genetically identical individuals". Similarly, a database 'clone', produced using SQL Clone, is a complete copy of the source database, identical in terms of file layout, metadata, schema, statistics and data.
A clone is a real database in every sense, and SQL Server treats a clone exactly as it would any other database. However, even if your source database is very large, up to 64 TB in size, SQL Clone can make clones available to the teams who need them in minutes, rather than many hours. A database can also be cloned onto a network-connected laptop without causing disk-space problems, provided that the laptop has a reliable connection to the network.
The underlying difference, with SQL Clone, is that we can create a database copy without having to move all the data over to the target server. Instead, SQL Clone exploits disk virtualization technologies in the Windows OS. It creates one full, byte-by-byte copy of the source database, called an image, and from that image it can then quickly create multiple clones, or 'virtualized copies'.
Each clone has full network access to all the data as it existed at the time of image creation, and needs to store locally only the data that has changed. The initial local storage footprint for each clone is therefore only tens of megabytes, and so SQL Clone enables users to clone even multi-terabyte SQL Server databases, very quickly, creating as many identical copies of a source database as required.
SQL Clone is a lightweight tool for copying databases, with centralized and scripted control, and which incorporates data masking to protect personal and sensitive data. Therefore, clones are an efficient way for an organization to make full use of its existing data, when engaged in development, testing and training, while ensuring that all data movement is controlled, and that sensitive and personal data remains protected. Typical uses of clones include:
You should install SQL Clone on a central server that is visible over the network from everywhere you want to use SQL Clone. This central server will host the web application, called SQL Clone Server, through which you can manage the creation, deployment and removal of images and clones.
You only need to install one copy of SQL Clone Server per network, regardless of the number of servers on which you wish to perform SQL Clone operations. SQL Clone Server must be accessible on port 14145 (for its web interface and PowerShell clients) and port 14146 (for communication with agents).
SQL Clone Server uses its associated configuration database to store all the details of images and their associated clones, including properties such as their location, size, when they were created and by whom, as well as the security details regarding who can administer and access the system. It also keeps track of where you've installed SQL Clone Agents.
SQL Clone Agent is the workhorse of SQL Clone, responsible for performing all the work you request through the SQL Clone Server, such as creating an image from a backup file, or creating a clone from an image.
As such, you will need to install a SQL Clone Agent on every server containing databases from which you need to create images, or to which you wish to deliver clones.
We recommend use of a Windows network file share for storing the images. The Clone Agent will need read and write access to this location. Each clone will access the data from its image, in response to any queries, and so it will be beneficial to choose a high-speed, high-bandwidth network location for the image. As clones depend on the data in the image, they will not function if this connection is interrupted. We recommend co-locating images and clones on the same local area network whenever possible.
You will also want to place access controls on this file share similar to those you have in place for protecting files in your production and pre-production database environments. Images only need to be accessible by the user that SQL Clone Agent services run as.
SQL Clone does not currently work with Azure SQL Database (PaaS) because we cannot control the storage tier. However, SQL Clone works fine with Azure virtual machines (IaaS).
No, they don’t all have to be the same. An image persists the database according to the SQL Server version of the instance on which it was created (backups may be upgraded during this process). Providing the target SQL Server instance supports the SQL Server version the resulting images and clones should work. In practice, this means that you can create a clone on an instance running a version of SQL server that is the same or newer than the SQL Server version used to create the image.
SQL Clone uses role-based permissions to regulate access to SQL Clone Server and its functionality. The most powerful role (Admin) give full administrative control over SQL Clone, including the ability to assign individual users, or Windows and Active Directory Groups, to each role. The standard role has permissions to create and manage images and clones, while the clone only role has restricted ability to create clones from certain images.
The impact will be similar to that of creating a backup. A shadow copy is created of the volumes where the database files reside and the files are copied into the image. Changes made during the process are saved in a copy-on-write manner. This can result in a brief I/O pause while the shadow copy is initialized, and some I/O degradation during the process due to the copy-on-write behaviour. There is no impact to the instance once image creation is complete.
We recommend using backups or a temporary staging environment to avoid having to run SQL Clone Agent and incurring these effects on production systems.
Yes, there is a PowerShell interface. PowerShell cmdlets can be downloaded from the SQL Clone Server settings page. You can read more about using the PowerShell cmdlets in our documentation.
Yes, SQL Clone offers a script runner, which allows you to run your own scripts, as part of their standard image creation process, to mask data or change permissions, and so on.
If you are using SQL Provision, then you can use Data Masker for SQL Server to predefine a set of data masking rules (such as substitution or shuffle), which define how each column and row should be obfuscated and anonymized. These rules form your masking set, which SQL Clone can apply as an integral step of a PowerShell-automated provisioning process.
Clones will not work without the source image since all or most of the data will be stored in the image. You should always delete images through SQL Clone Server, which will not permit you to delete images which have active clones.
SQL Clone Server doesn't currently report which clones have active connections, but you can use Activity Monitor in SSMS, or run sp_who2
, or query the sys.sysprocesses system view.
An image is a point-in-time copy of a database and cannot be updated. The advantage of this is that it's very easy to create a clone, run some tests, tear it down, and then immediately create a fresh clone, with the original data. The disadvantage is that capturing the latest production changes requires creating a fresh image from scratch, from the latest backup, although this is easy to automate as an overnight job. We intend to explore what options exist in this area in the future.
Clone data files are created in the default data location for the SQL Server instance on which they are created. You can change this in SSMS.
Yes, you can rename clones by renaming the clone database in SQL Server.
Because the data is split between the image and the clone files, machines with clones must maintain a connection to the image file, and with SQL Clone Server, for clones to work. If you can connect to the network remotely (VPN) then clones will function, provided that the connection is stable, but performance may be affected. If you are unable to access the network, clones will not function. You can control SQL Clone from any machine on which you can reach the SQL Clone Server.
Yes, you can perform performance tuning against a clone, as it will have the same statistics and, in most cases, the same query plans (unless SQL Server upgrades the database) as the original database. Because there will be some additional latency, absolute performance is likely to be lower, but optimising by comparing relative performance is feasible. We would not recommend using clones for realistic load testing, however.
Yes, this works fine. We use virtualized environments including VMWare, VirtualBox and Azure virtual machines during our development and testing, as do many of our users, and have not experienced any problems.