5 reasons why backup and restore doesn’t cut it in dev and test
If developers and testers work with a database that has similar data characteristics and is of a similar size to production, it will lead to fewer surprises when deploying database changes and help the team troubleshoot urgent production bugs or performance issues. Likewise, if developers have an easy way to spin up multiple database copies, run tests, then revert the database back to the original state quickly, they can achieve far more within a time-limited test cycle, and so drive up the quality of your software.
However, whenever developers or testers need to create or refresh their local database copies, they will often need to wait for a DBA to provide them with the latest database backup, cleansed of any personal or sensitive data. Each time they need to create a fresh copy, or each time they want to revert a database to its original state, it will involve restoring a backup.
This places a strain on the administrative team who must provide the sanitized backups, and also restricts the speed at which developers and testers can work. This article highlights the five reasons why backup and restore as a provisioning process can cause unnecessary pain for you and your organization, and suggests an alternative approach using SQL Provision.
- Time
- Resources
- Compliance
- Bottleneck
- Management
Time
Depending on the size of the original database, restoring a database backup can take hours per copy. For very large databases, we’ve encountered development teams who only get a full database refresh once a quarter, or even less often. Development and test teams could be waiting a long time before they can refresh outdated dev and test databases. Every time they need to revert a database to its previous state, following one series of tests and in preparation for the next, they will again need to wait for a restore operation to complete. This slows down the speed at which database problems are resolved or changes are made and tested, which in turn restricts the speed at which the business can adapt to new requirements.
Resources
As well as costing time, another resource that feels the strain of a backup and restore process is disk space. Storage is expensive and having a team of 15+ developers who all want to work on their own sandbox environment can result in a costly bill or a team of disgruntled developers forced to work in a shared model. Imagine a 30TB production backup needing to be restored to test, QA, staging and development environments for each of your team members. We can see how this would quickly consume costly disk space, and could become a blocker for a team requiring multiple environments to work with.
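To put a rough number on the scenario above, here is a back-of-the-envelope sketch. It assumes every team member gets a full, uncompressed restore in each of the four environments mentioned (test, QA, staging and development), and it takes 15 as the team size. The function name and parameters are illustrative only.

```python
def full_copy_storage_tb(db_size_tb: float, environments: int, team_members: int) -> float:
    """Disk space needed if every team member holds a full restore in every environment."""
    return db_size_tb * environments * team_members


# Assumed figures: 30 TB database, 4 environments, 15 developers.
total_tb = full_copy_storage_tb(db_size_tb=30, environments=4, team_members=15)
print(f"{total_tb:,.0f} TB of storage for full copies")  # 1,800 TB
```

Even if only a fraction of the team needs every environment, the total grows linearly with each extra copy, which is why teams often end up sharing databases instead.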
Compliance
In Redgate’s 2018 Data Governance Implementation Survey, we learned that 72% of SQL Server professionals work in an organization that is subject to legislation such as HIPAA, POPIA or the GDPR. Therefore, it’s vital for these organizations to mitigate any risks where sensitive information could be potentially breached. The truth is, real customer or user data is rarely needed outside production – it’s the size, shape and demographic distribution that’s important for the purposes of testing. Having multiple copies of the production database available in pre-production environments just increases the attack surface and the chances of leaks due to human error. Any subsequent data breaches can damage an organization’s reputation and could also result in a large fine. The DLA Piper data breach survey in February 2019, for example, concludes: “We anticipate that 2019 will see more fines for tens and potentially even hundreds of millions of Euros as regulators deal with the backlog of GDPR data breach notifications.”
Despite this, 65% of participants in Redgate’s 2019 State of Database DevOps Report revealed that they use a copy of production data in pre-production. Sensitive data like PII or PHI in production restores can be protected through a masking model, but only 36% said they did so, and even this is not without its complications. While homemade masking scripts can anonymize sensitive data, they may not scale well and can distort the demographic distribution of the data, nullifying the point of a production restore.
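As an illustration of the distribution problem, the sketch below shows one common way a masking script can hide real values while keeping their frequency profile: deterministic substitution using a keyed hash. This is a minimal example under stated assumptions, not how any particular product implements masking. The key, the `mask_value` helper and the sample surnames are all hypothetical.

```python
import hashlib
import hmac
from collections import Counter

# Assumption: in practice the key would be kept in a secret store, not in the script.
SECRET_KEY = b"hypothetical-masking-key"


def mask_value(value: str, prefix: str = "user") -> str:
    """Map a real value to a stable pseudonym; equal inputs give equal outputs."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{prefix}_{digest[:12]}"


# Hypothetical sample column with a skewed distribution.
surnames = ["Smith", "Smith", "Jones", "Smith", "Patel", "Jones"]
masked = [mask_value(name) for name in surnames]

# The real names are gone, but the frequency profile (3, 2, 1) is unchanged.
original_profile = sorted(Counter(surnames).values())
masked_profile = sorted(Counter(masked).values())
print(original_profile == masked_profile)
```

Because the same input always produces the same pseudonym, a value masked in two different tables still matches on a join. A script that replaces each value with a fresh random string would lose both that property and the original distribution.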
Bottleneck
In most organizations, the DBA or architect is responsible for provisioning development and test servers, which can cause a bottleneck. Developers are unable to self-serve and must wait on the availability of the DBA to provide the copy for them. As highlighted earlier, the process can also take hours or days, so imagine a DBA having to do this for a team of ten or more developers. Without automated scheduling or self-serve functionality, the DBA’s time will be taken away from important tasks like performance tuning, while the developers will be prevented from testing their changes as soon as they are made.
Management
Lastly, it’s important to have control over the various database copies being distributed to team members. With backup and restore, this can be challenging. There is no central view in which environments can be easily and quickly refreshed, or that provides a clear picture of where production copies are or whether the copies are masked.
Without this kind of management view, precious disk space could be taken up by forgotten copies and the attack surface increased unnecessarily. How do you know for sure that your team are working on the same, up-to-date version for dev and test? If a compliance auditor visited your organization, could you confidently state where all the sensitive data resides, or that production copies are suitably masked? Answering these questions shouldn’t make you break a sweat; the answers should be readily available in one centralized place.
Whether one or all of the points above are applicable to you and your organization, there are solutions available which can help you address the issues being raised.
SQL Provision
SQL Provision from Redgate leverages Microsoft’s proven virtualization technology and combines it with comprehensive data masking capabilities. This enables copies of production databases to be created and restored to non-production environments in seconds, using only a fraction of disk space, with the data in those copies automatically masked. Importantly, even if data masking is applied, the final data will retain its referential integrity and original distribution characteristics.
The process can also be automated and scheduled to run overnight, allowing for developers to start every day with a fresh, up-to-date copy. SQL Provision’s built-in dashboard provides a clear overview of environments, copies, self-serve functionality and reset features, so you can sleep soundly at night knowing developers can quickly spin up refreshed environments to the requirements you’ve specified.
As a measure of its value, KEPRO, a leading healthcare provider in the US, replaced its backup and restore process with SQL Provision and managed to save 15-20 hours a week in its provisioning process, along with terabytes of data. KEPRO was able to get its new offshore development team up and running quickly and smoothly, while complying with the necessary data privacy regulations. The team could self-serve and refresh their environments as and when they needed, freeing up the DBA’s time to focus on more proactive tasks for the organization.
The provisioning process is a critical part of database development. Introducing a technology that can remove the pain points of backup and restore and replace them with a more efficient solution can hugely benefit your organization and the quality of life of you and your team.
To learn more, visit the SQL Provision product page.