A common thread behind the software we develop is a problem or issue we’ve come across in the software development process. We then adopt the software ourselves, which gives us a real understanding of the features that need to be developed further. To gain a deeper insight into SQL Clone, we spoke to Joseph Woodhouse, our IT Manager, to find out about how it’s being used to implement a new database provisioning process at Redgate.
As Redgate’s IT Manager, what are your responsibilities
regarding the distribution of database copies?
I’m accountable purely for data security, ensuring our customer data is protected. I have to make sure the process we use to distribute copies of our databases, for our own use, guarantees those copies are sanitized, and don’t contain customers’ personal information. It’s also important that even the sanitized copies are only accessible to those people within the business who really need them.
What databases do you
need to copy, and why?
The main database we need to copy is the one at the back end of the Redgate website, which contains information about our tools, but also some customer data. The database is several hundred gigabytes in size. Several areas of the business, including our BI team, occasionally request copies, but primarily we receive requests from our web development team when they need to troubleshoot a reported website issue, or test out a new feature.
Can you describe briefly Redgate’s
old way of provisioning this database?
Simon, our DBA, would receive quite a few ad hoc requests for a copy of the latest version of the database. We take nightly database backups using SQL Backup, so his first job was simply to restore the latest backup. Next, he’d need to sanitize that data manually. We have a policy in place for what types of data can and can’t be distributed, so he’d run scripts to sanitize the data in any columns containing Personally Identifiable Information (PII) such as names, addresses, phone numbers, IP addresses, and so on.
Finally, he’d take a backup of the sanitized database, drop it onto a shared drive, and notify people that it was there. From that point, the developers would copy the sanitized backup to their test server, and restore it.
What were the main issues with
this way of provisioning databases?
The big problem for me was simply that the process was ad hoc and manual. This meant it just wouldn’t stand up to any external audit. With a manual process, you can’t prove that certain steps aren’t ever skipped by accident, so if you’re trying to convince an auditor the data is always cleansed, always secured; no way. The upcoming General Data Protection Regulation (GDPR) meant I needed a process that was standard and generic, and that was automated so we could track and audit it.
It was also taking up too much of Simon’s time. He’s our only DBA and I needed him to be focused on tasks that were important to us strategically, and not wasting time manually copying and sanitizing databases.
Is the new database provisioning
process fully automated?
Yes, the whole process of delivering a sanitized copy of the database to a shared drive is now automated, and uses SQL Data Generator.
Also, having delivered the sanitized backup, we’ve implemented SQL Clone so that developers can deploy that database to their test systems very quickly. They no longer need to copy and restore the sanitized backup manually every time they want to run tests with the latest version of the database.
How does the new
The first step is the same, and SQL Backup continues to do what it has always done which is to take a nightly copy of the database. After that, the data sanitization and delivery process is now fully automated, and consists of a PowerShell script that runs every day, as a scheduled task.
The script restores the latest backup and then uses SQL Data Generator (SDG) for the sanitization step. SDG generates dummy data for any columns that contain personal data, and copies every other column, as is, from the restored database. As part of the script, we restrict database access to members of specific Active Directory groups, then set the database to read-only, take a full database backup, and finally copy the file to the shared drive where the developers can access it.
In a separate automated task, which runs every morning, we have a PowerShell script that uses SQL Clone to create an image of the sanitized database backup. From this image, developers can self-serve a clone as often as they like. It takes only a few seconds each time, and takes up very little disk space on their systems. The only prerequisites are that they’ve installed the SQL Clone agent on their machine, and are a member of an AD group that has access to the database.
Do you have any plans to
refine the process further?
We plan to investigate using the new SQL Clone permissions feature to refine access levels for SQL Clone. As I mentioned, we restrict access based on AD groups, but in SQL Clone we currently have permissions set to all or nothing, so you’re either an admin or you don’t have access. The new permissions feature means we can give admin access, access to create images, or just access to create clones.
We may also investigate further uses for SQL Clone. For example, certain nodes within our SQL cluster are used for reporting. Potentially, we could offload reporting to Clone.
How would you sum up the main
benefits of the new process?
From my point of view, I have one less thing to worry about! Our provisioning process is now fast and reliable, and we can pretty much guarantee that no customer data will end up in the wrong hands. When auditors come knocking, I’m confident it will stand up to inspection.
Also, my only DBA is no longer continually distracted by ad hoc requests for database copies. He can focus on planned projects, like working out how to automate some of our other maintenance activities. The process uses standard Redgate tools so other people in the team can follow the process as well, without it having to be the DBA.
As importantly, the developers are a lot happier. They don’t need to ask Simon for the latest version because it’s just there, and with SQL Clone, deploying copies of the database to their servers is much quicker.
Find out more about how SQL Clone makes database provisioning easy with a 14-day fully functional free trial.
Also in Product learning
It would be wrong to portray SQL Change Automation (SCA) as being suitable only for epic project deployments, of the sort described in my previous article. It can do smaller tasks as well. To demonstr...
Also in Hub
The SQL query that is used to produce the result that is inserted into the permanent table has its order specified by an ORDER BY statement. Relational tables are not ordered, so the ORDER BY is meani...
Also in SQL Clone
Sometimes developer teams need access to a copy of the database containing live data. However, if that database contains sensitive or personal data, then it cannot be used for testing and development ...
Also about database provisioning
Developers, when working on databases rather than the application code, often find they have less freedom to experiment than they are used to. It's not necessarily easy to set up a database for testin...
Also about SQL Clone
The GDPR, and other regulations, requires that we be careful in how we handle sensitive data. One of the easiest ways to avoid a data breach incident, and any accompanying fine, is to limit the sensit...