PASS Data Community Summit logo

PASS Data Community Summit 2023

Get all the latest announcements direct to your inbox

Documentation

The Key Problems SQL Clone Solves

Any small database can be copied by restoring it from a backup, or attaching the data and log files, or by building it and then importing data. However, these methods become less and less practical as the database increases in size. Network admins will need to increase the available storage, and the physical process of copying the data will take many hours, every time.

Also, unless the data can be effectively anonymized beforehand, the backup/restore method is fraught with risk because the copies cannot be restricted, terminated or audited. This means that data protection and privacy concerns will restrict the free movement of at least part of the data to the business activities that need it.

The inevitable compromise is that teams work with 'stale' copies of the data, or are restricted to using only small sets of artificial test data, which often doesn't reflect the characteristics of the 'real' data.

Why is SQL Clone a better way of copying databases?

SQL Clone accelerates the delivery of database changes by making available a lightweight mechanism for copying databases, with centralized and scripted control, and which incorporates data masking to protect personal and sensitive data. It allows teams to manage the copying of databases, responsibly, and without blocking their development and release processes, or introducing an unrealistic administration burden.

What problems does it solve for the Operations team?

SQL Clone removes much of the logistical and administrative burden associated with copying large databases, and allows the team to introduce governance to database provisioning.

The need to make only one full copy (image) of the source database minimizes the impact on disk storage and network resources. For example, creating ten copies of a 500 GB database requires a total of around 500 GB of initial extra storage space, rather than 5 TB. Also, making a clone of a 500 GB database takes a few seconds rather than about 5 hours.

SQL Clone provides not only a viable way to copy data around, but also the means to do so responsibly, in compliance with data protection and privacy. With SQL Clone, the DBA can run data masking sets as an integral part of the image creation process. This means that sensitive data, such as credit card numbers, can be removed or obfuscated, and personal data modified such that it can no longer be directly attributed to a specific data subject.

Using SQL Clone Server, the DBA can control database movement securely. They can see what clones exist on what servers, and who created them. They can run scripts to remove clones automatically. SQL Clone's access permissions allow control over who can create and manage clones. Using built-in PowerShell integration they can automate the process of deploying a clone of the most recent, anonymized, image of the production database, such that it is repeatable, auditable and visible.

What problems does it solve for the Development team?

Developers and testers occasionally need to work on databases, or the applications that depend on them, where the data has the same volume and characteristics as the current production databases. Occasionally, only the real production data will allow the developer to replicate the production issue in the development and test environments. SQL Clone can make it quicker and easy for a developer to get a fresh copy of the 'real' data, while knowing that any constraints regarding security and pseudonymization have already been handled.

With SQL Clone, the provisioning is so easily automated and secured that developers and testers can spin up clones just for the time they are needed, and then tear them down; a 'real' database becomes just another standard resource that can be imported, used and reset, as required.

SQL Clone is versatile enough to support the way that development teams choose to work, which brings all sorts of benefits to the team. It means, for example, that each developer can work in an isolated database development environment, complete with 'real' data. Also, many development tasks such as unit testing, basic build and integration testing can be more comprehensive, within existing time constraints, because database tests can be run in parallel.

What problems does it solve for the business?

SQL Clone speeds up the process of database build and 'tear down', minimizes the demand on time and network resources, and make the process secure, controlled and auditable.

This gives an organization the freedom to make full use of its existing data when engaged in development, testing and training. Data can be exported safely to the BI team for reporting and analysis, as well as to the development and test teams.

By allowing the Operations team to set up an integration environment that reflects more closely the production environment, and the development teams to incorporate realistic data sets into their testing cycles, bugs and performance problems are caught much sooner, and don't creep into the production applications.

By reducing the overhead of the provisioning process, there is more time for the important development tasks of optimizing the support, monitoring, compliance, security and performance characteristics of the functionality they plan to deliver to production. It also means that the development work is deployed earlier and more often, so it is more visible to other teams within IT. The net result is far fewer unpleasant surprises.