The conflict between data protection and DevOps

Data breaches are the new normal. According to the Identity Theft Resource Center, there were nearly 1,600 of them in 2017 in the US alone, exposing 179 million records. Demonstrating the scale of the issue on the other side of the Atlantic, UK retailer Dixons Carphone admitted in June 2018 that it had suffered a hacking attack involving 5.9 million payment cards and 1.2 million personal records.

No wonder companies want to keep data confidential and protected, particularly in a world where they can be fined up to 4% of their annual global turnover for non-compliance with the GDPR, which protects the personal data of European citizens, and where data privacy legislation is popping up everywhere. Sarbanes-Oxley (SOX) and HIPAA are being joined by the California Consumer Privacy Act on January 1, 2020, and it’s likely to be the first of many new regulations in the States.

At the same time, however, the increasing pace of business means developers need to create and release code on much shorter DevOps timescales. That includes the database too, because Redgate’s 2018 State of Database DevOps Survey revealed that 76% of developers are now responsible for developing databases as well as applications.

This is where the snag arises because, in order to develop and test database code, developers need access to ‘real’ data. Redgate’s survey also showed that 67% of developers use production data in development and test environments, usually in the form of a copy of the production database.

So on the one side you have Database Administrators looking to protect data and account for every record, aiming for anonymity and confidentiality, and on the other you have developers needing up-to-date, production-scale data to properly test changes.

An additional hurdle is the time and disk space eaten up by provisioning copies of databases or backups for use in development. This can act as a blocker, either slowing down releases or forcing companies to use out-of-date copies that are no longer representative of the production database.

The need for compliant provisioning

How do you solve this conflict? The database can’t be the bottleneck in DevOps, yet data must be protected. Organizations therefore need to adopt a four-stage process to keep everyone happy and development and test on track.

1 Provision copies

Duplicating a full database to give a copy to developers is time-consuming and takes up lots of storage space, particularly given the growing size of production databases and the desire of many developers to have their own copy as a sandbox to test changes against.

However, solutions like Redgate’s SQL Clone (the virtualization technology available within SQL Provision) have emerged that clone databases and give developers copies which are a fraction of the size of the original database. Importantly, the clones work just like normal databases and can be connected to and edited using any program, so developers don’t even notice the difference when using them.

Solutions like this smooth out the provisioning process and speed up the DevOps workflow for both DBAs and developers.

2 Protect the data

Faster and more efficient provisioning of copies provides part of the answer, but it doesn’t remove the need to preserve the anonymity and confidentiality of customer data. That’s where data masking comes in, replacing sensitive data like personally identifiable information with fictional data that retains the look and feel of the original.

This means the database remains perfectly usable for development and still operates in the same way as the real data for testing and analysis, but the content is secure, helping to meet regulations such as GDPR, HIPAA and SOX.
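To make the idea of masking concrete, here is a minimal Python sketch. It is an illustration only and not how Data Masker or any other product works: the function names, the name pools, the column names and the salt are all assumptions made up for this example. It swaps names for fictional ones and replaces digits with pseudo-random digits, so values keep their length and punctuation, and it is deterministic, so the same input always masks to the same output.

```python
import hashlib
import random

# Hypothetical pools of fictional names; a real rule set would be far larger.
FIRST_NAMES = ["Alex", "Sam", "Jordan", "Casey", "Morgan", "Riley"]
LAST_NAMES = ["Smith", "Jones", "Taylor", "Brown", "Wilson", "Evans"]


def _rng_for(value: str, salt: str) -> random.Random:
    """Seed a private generator from the value, so masking is repeatable."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return random.Random(int(digest, 16))


def mask_name(name: str, salt: str = "dev-copy") -> str:
    """Replace a real name with a fictional one drawn from the pools."""
    rng = _rng_for(name, salt)
    return f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"


def mask_digits(value: str, salt: str = "dev-copy") -> str:
    """Replace every digit with a pseudo-random digit, keeping length and separators."""
    rng = _rng_for(value, salt)
    return "".join(str(rng.randrange(10)) if ch.isdigit() else ch for ch in value)


def mask_row(row: dict) -> dict:
    """Return a masked copy of one record; the column names are assumptions."""
    masked = dict(row)
    masked["name"] = mask_name(row["name"])
    masked["card_number"] = mask_digits(row["card_number"])
    masked["phone"] = mask_digits(row["phone"])
    return masked


if __name__ == "__main__":
    customer = {
        "name": "Jane Doe",
        "card_number": "4111-1111-1111-1111",
        "phone": "+44 20 7946 0000",
    }
    print(mask_row(customer))
```

Deterministic masking keeps related tables consistent, because the same customer masks to the same fictional value everywhere. The trade-off is that the salt has to be kept secret, and a sketch this simple makes no attempt to keep checksums or other business rules valid.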

As well as development and testing, this offers valuable advantages across a wide range of business areas including business analysis and cloud adoption – essentially any situation where there’s a privacy or security risk associated with using real, identifiable data. Companies should look for a solution like Redgate’s SQL Provision, which includes Data Masker. This enables them to customize the rules that control which data is masked at scale, ensuring company-wide standards and compliance.

3 Automate the process

Different developers and testers will want access to data at different times, probably when DBAs are at their busiest. Therefore, look at self-service tools that automate the process and make it easy for developers to receive copies of the database with information safely masked, when and where they need them. This frees up DBA time to focus on more business-critical activities.
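As a rough picture of what such automation does, the Python sketch below copies a database and masks it in a single step, so the requester only ever receives the masked version. It uses SQLite from the standard library purely as a stand-in; the table, the columns and the function names are assumptions for illustration, and a real tool working against SQL Server would do this very differently.

```python
import sqlite3


def make_production_stand_in() -> sqlite3.Connection:
    """Build a tiny in-memory database that plays the role of production."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    db.executemany(
        "INSERT INTO customers (name, email) VALUES (?, ?)",
        [("Jane Doe", "jane.doe@corp.test"), ("John Roe", "john.roe@corp.test")],
    )
    db.commit()
    return db


def provision_masked_copy(production: sqlite3.Connection) -> sqlite3.Connection:
    """Copy the source database, then mask the copy before handing it over."""
    copy = sqlite3.connect(":memory:")
    production.backup(copy)  # full copy; production itself is not modified

    # Overwrite personal columns with obviously fictional values.
    ids = [row[0] for row in copy.execute("SELECT id FROM customers")]
    copy.executemany(
        "UPDATE customers SET name = ?, email = ? WHERE id = ?",
        [(f"Customer {i}", f"customer{i}@example.com", i) for i in ids],
    )
    copy.commit()
    return copy


if __name__ == "__main__":
    production = make_production_stand_in()
    dev_copy = provision_masked_copy(production)
    print(dev_copy.execute("SELECT id, name, email FROM customers").fetchall())
```

Because copying and masking happen inside one function, there is no point at which an unmasked copy is handed to the requester, which is the property a self-service process needs.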

4 Manage access

One of the key requirements of regulations like the GDPR is that companies need to be able to document their processes and provide full audit trails of who had access to which data. Ensure you are able to create a central record of all your database copies, so that you can track this, and look at role-based provisioning to prevent unauthorized access to sensitive information.
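A central record with role-based checks can be sketched in a few lines of Python. Everything here is hypothetical: the roles, the policy table and the class names are assumptions chosen for illustration, and a production system would persist the log somewhere tamper-resistant rather than hold it in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical policy: which roles may receive which kind of copy.
ALLOWED_KINDS = {
    "dba": {"masked", "unmasked"},
    "developer": {"masked"},
    "tester": {"masked"},
}


@dataclass
class CopyRecord:
    copy_id: int
    source_db: str
    kind: str
    owner: str
    created_at: datetime


@dataclass
class AuditEntry:
    when: datetime
    user: str
    role: str
    source_db: str
    kind: str
    outcome: str  # "granted" or "denied"


@dataclass
class CopyRegistry:
    """Central record of every copy handed out, plus every request made."""
    records: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def request_copy(self, user: str, role: str, source_db: str, kind: str = "masked") -> CopyRecord:
        now = datetime.now(timezone.utc)
        allowed = kind in ALLOWED_KINDS.get(role, set())
        # Log the request whether or not it succeeds, so denials are auditable too.
        outcome = "granted" if allowed else "denied"
        self.audit_log.append(AuditEntry(now, user, role, source_db, kind, outcome))
        if not allowed:
            raise PermissionError(f"role {role!r} may not receive {kind} copies")
        record = CopyRecord(len(self.records) + 1, source_db, kind, user, now)
        self.records.append(record)
        return record

    def copies_of(self, source_db: str) -> list:
        """Answer the audit question: who holds a copy of this database?"""
        return [r for r in self.records if r.source_db == source_db]


if __name__ == "__main__":
    registry = CopyRegistry()
    registry.request_copy("dana", "developer", "Sales")
    try:
        registry.request_copy("dana", "developer", "Sales", kind="unmasked")
    except PermissionError as err:
        print("Denied:", err)
    for entry in registry.audit_log:
        print(entry.user, entry.kind, entry.outcome)
```

Recording denied requests alongside granted ones means the log shows attempted access to sensitive data as well as actual access.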

Summary

The database needs to evolve faster for DevOps to achieve its full potential – it cannot be the bottleneck. At the same time, companies know the importance of protecting confidential information, and the consequences of failing to do so. Only by adopting a combination of efficient provisioning and data masking will they be able to balance these two demands, remove conflict and speed up development, test and release across their business.

This article was originally published on DevOps Online on 6 July 2018.

SQL Clone and Data Masker for SQL Server are capabilities of SQL Provision. Find out more about SQL Provision and grab your 14-day fully functional free trial.