A data discovery and classification research project from Foundry


In Foundry, we’re responsible for developing new products and technology to support the changing needs of our customers. We’ve seen a huge shift in our customers’ needs: driven by new and constantly evolving regulations, there’s an ever-growing demand on their time for data governance tasks.

As a result, we’re developing prototype software in SQL Data Mask (to de-sensitize databases) and SQL Census (to help you explain your user access permissions to auditors).

Both of these projects got us thinking: how do you know what’s sensitive enough to need masking or restricted user access?


What is data classification?

Enter ‘data classification’, the process of fully understanding your data so you can protect it accordingly. This process is often considered foundational, given its impact on higher-level projects (e.g. obscuring sensitive data and reporting on user access permissions).

1. Discovery and classification

The generally accepted process starts with the discovery and classification of data. Classification depends on understanding the content, the context and the users of the data. The classification categories will largely depend on the data you hold and the risk it carries for your organization.

2. Definition of responsibilities

Once data has been classified, the process continues with the definition of responsibilities. At this stage, you’re adding metadata to the information that’s held within the database. At audit time, it will be important to understand who created the data, who owns it, who’s using it and who’s responsible for it.

3. Addressing the data

The final step is to actually address the data, making decisions about what needs to be done with it.
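
To make the three steps a little more concrete, here is a minimal Python sketch. It is not how SQL Data Mask or SQL Census work, and every name in it (the RULES patterns, the labels, ColumnRecord, build_catalog) is hypothetical. It assumes you already have a list of table and column names, which for SQL Server might come from a query against INFORMATION_SCHEMA.COLUMNS, and it guesses a classification from the column name alone, which will miss some columns and misfire on others.

```python
import re
from dataclasses import dataclass

# Hypothetical name patterns and labels; a real project would define its own
# categories based on the data it holds and the risk that data carries.
RULES = [
    (re.compile(r"email|phone|mobile", re.I), "Contact details"),
    (re.compile(r"birth|dob", re.I), "Personal"),
    (re.compile(r"card|iban|account_?number", re.I), "Financial"),
]


@dataclass
class ColumnRecord:
    table: str
    column: str
    classification: str = "Unclassified"  # step 1: discovery and classification
    owner: str = "Unassigned"             # step 2: definition of responsibilities
    action: str = "Review"                # step 3: addressing the data


def classify(table, column):
    """Step 1: guess a classification from the column name (a crude heuristic)."""
    for pattern, label in RULES:
        if pattern.search(column):
            return ColumnRecord(table, column, label)
    return ColumnRecord(table, column)


def build_catalog(columns, owners):
    """Run all three steps over a list of (table, column) pairs."""
    catalog = []
    for table, column in columns:
        record = classify(table, column)
        # Step 2: attach a responsible owner, looked up per table here.
        record.owner = owners.get(table, "Unassigned")
        # Step 3: decide what to do with anything that looks sensitive.
        if record.classification != "Unclassified":
            record.action = "Mask and restrict access"
        catalog.append(record)
    return catalog


if __name__ == "__main__":
    columns = [
        ("Customers", "EmailAddress"),
        ("Customers", "DateOfBirth"),
        ("Orders", "OrderTotal"),
    ]
    for record in build_catalog(columns, {"Customers": "CRM team"}):
        print(record)
```

Anything left as "Unclassified" still needs a person to look at it; the sketch only narrows down where to start.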


Why go to the effort of classifying data?

On the face of it, a data classification project might seem comparable to spending the weekend ordering the books in your bookcase by color (time I don’t regret spending). It is a foundational project, though, and one that is likely to have a positive impact on the work that follows: becoming compliant with regulation, delivering reports during an audit, moving data around, or just understanding your organization’s exposure to risk through the data it holds.

These are the next steps we’ve already begun to cover with SQL Data Mask and SQL Census, but the implications of better understanding the data your organization holds may be even further-reaching for you.

So what’s the problem?

Running a data classification project may sound simple in theory, but we think that it’s easier said than done for SQL Server. We’re preparing to run our own data classification project and so far we’ve only managed to find tools and support for documents, file systems and emails. The support for data classification with SQL Server appears scant.

It’s also likely that a classification project will only cover a snapshot in time. New development projects will come and go and, over time, there’s no doubt that you’ll need to solve the problem again for any new data that appears. A proactive approach seems more sensible: classifying new data as changes occur. Building a proactive solution is difficult, though, given the complexity of existing processes.

Tell us about your experience

As with every Foundry research project, we rely on your expertise and experience to help us understand the problem and begin to build a solution. Are you about to begin your own classification project? Have you recently completed one? We’d like to hear about it.

We’ll always endeavour to compensate you for your time, whether with an Amazon Gift Card, by paying particular attention to your requirements, or even with a free license of the software that’s developed as a result. Please sign up to participate in this data classification research project.

Sign up to the data classification research project

