When is the Data Deleted?

Imagine that your business is providing a service to individuals, and you charge by the amount of usage. You are trading your service internationally and need to keep a record of who among your customers does what. You then produce invoices and keep accounts. Your customers pay via a third-party service. So far so good.

The government contact you, and specify that anti-terrorism legislation requires you to retain records for, say, seven years. However, your German branch insists that they’ve been told it must be a ten-year retention period. Your lawyers say that your users will soon have the right to request that you delete their personal information, presumably once invoices for that period are paid. Then you are told that these records may be required as evidence in criminal prosecutions, so there is a minimum required ‘document retention’ period, in case you are subject to a request for evidence in a corporate fraud prosecution. Your German branch are back on the line saying that their privacy legislation mandates that all records need to be deleted within three years.
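To see why these demands cannot all be met at once, it helps to write them down as numbers. The following is only a minimal sketch: the `Rule` class, the `find_conflicts` function, and the rule names are hypothetical, and the year values are simply those quoted in the scenario above, not real legal requirements.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    """One retention demand (hypothetical structure, for illustration)."""
    source: str
    keep_at_least_years: Optional[int] = None  # minimum retention period
    delete_within_years: Optional[int] = None  # maximum retention period


def find_conflicts(rules: List[Rule]) -> List[Tuple[str, str]]:
    """Return (retain_source, delete_source) pairs that cannot both be obeyed."""
    conflicts = []
    for keep in rules:
        if keep.keep_at_least_years is None:
            continue
        for drop in rules:
            if drop.delete_within_years is None:
                continue
            # A record cannot be kept for longer than it is allowed to exist.
            if keep.keep_at_least_years > drop.delete_within_years:
                conflicts.append((keep.source, drop.source))
    return conflicts


# Figures taken from the scenario in the text, not from any actual statute.
rules = [
    Rule("anti-terrorism legislation", keep_at_least_years=7),
    Rule("German branch retention advice", keep_at_least_years=10),
    Rule("German privacy legislation", delete_within_years=3),
]

conflicts = find_conflicts(rules)
for keep_source, drop_source in conflicts:
    print(f"{keep_source} conflicts with {drop_source}")
```

Both retention demands clash with the three-year deletion deadline, so no single deletion date can satisfy everyone.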

Assuming we can make sense of the cacophony of mutually contradictory edicts from government, police, QUANGOs, international bodies, federal organizations, and trade bodies, how easy is it to delete a record, on request, or after a statutory time-interval?

In less complicated times, we deleted information by flagging it as ‘deleted’. Nothing got physically deleted. If no-one could access it, then it was ‘deleted’. We are cleverer now: we actually delete the data. Or do we? When attempting to do so, it’s very easy to leave information by accident at the end of data pages. Or we forget about the data that’s in the long snail’s trail of data backups and log backups. The auditing software that we are legally required to use also holds all this data, and it must run on a different system, which gets backed up too. Maybe the data is also in accidentally-persisted caches. Any good ‘data forensics’ person will tell you that data leaks everywhere.
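The difference between flagging a row and removing it can be shown in a few lines. This is a sketch only, using an in-memory SQLite database; the `customer` table and its `is_deleted` column are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " is_deleted INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Alice Example')")

# 'Soft' delete: the row is only flagged, so the application stops showing it.
conn.execute("UPDATE customer SET is_deleted = 1 WHERE id = 1")
visible = conn.execute(
    "SELECT name FROM customer WHERE is_deleted = 0"
).fetchall()
still_stored = conn.execute("SELECT name FROM customer").fetchall()

# 'Hard' delete: the row is removed from the table itself.
conn.execute("DELETE FROM customer WHERE id = 1")
after_hard_delete = conn.execute("SELECT name FROM customer").fetchall()

print(visible)            # nothing for the application to show
print(still_stored)       # the personal data is still in the table
print(after_hard_delete)  # the row has gone from the table
conn.close()
```

Even the hard delete only removes the row from the live table. It says nothing about what remains in freed space within the file, in backups, in logs, or in an audit system.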

Fine; then we also need to delete the data from the backup files. Have you ever tried that? Any respectable database system that is appropriate for commercial use puts a checksum in both the data file and the backup, using an algorithm that we are not privy to, for obvious reasons. You must not be able to alter data, even in the backup of a database, because this could potentially destroy evidence of fraud or malpractice.
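The effect of such a checksum is easy to illustrate. In this sketch SHA-256 is merely a stand-in, since the algorithm a real database system uses is not known here, and the `backup` bytes and the `checksum` and `verify` helpers are invented for the example.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Stand-in checksum; a real database system will use its own algorithm."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected: str) -> bool:
    """Report whether the data still matches the checksum recorded earlier."""
    return checksum(data) == expected


# A pretend backup file, with the checksum recorded when it was taken.
backup = b"id=1;name=Alice Example;amount=42.00\n" * 3
recorded = checksum(backup)

# Overwrite one customer's name in place, keeping the file the same length.
tampered = backup.replace(b"Alice Example", b"X" * 13, 1)

print(verify(backup, recorded))    # the untouched backup still verifies
print(verify(tampered, recorded))  # the edited backup no longer verifies
```

Scrubbing a single name is enough to make the whole backup fail verification, which is exactly the property that protects evidence and frustrates selective deletion.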

So how do we deal with this? We technologists take everything literally, which is why we are good at what we do. However, the legislators don’t share our meaning of the word ‘delete’, so we should think creatively. If, for example, someone tells me that I need to delete certain records after three years, and someone else tells me that I need to retain them for ten, then what do I do? After three years, I put the disks (including backups, audit files, the lot) in a remote secure storage that would require a magistrate’s signature to access. In a very uncertain world of electronic trade, we do the best we can: we encrypt stuff or lock it in concrete bunkers. No authority is concerned with trivial infringement. They have bigger fish to fry.
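One way to picture this compromise is as a record lifecycle with a sealed stage between ‘in use’ and ‘destroyed’. The sketch below is an assumption about how that might be modelled: the state names, the `record_state` function, and the idea that records become destroyable once the ten-year period has passed are all illustrative, not something any regulator prescribes.

```python
from datetime import date
from enum import Enum


class State(Enum):
    ACTIVE = "active"            # in the live system
    SEALED = "sealed"            # in secure storage, access needs authorisation
    DESTROYABLE = "destroyable"  # longest retention period has passed


SEAL_AFTER_YEARS = 3      # the 'delete within three years' demand
DESTROY_AFTER_YEARS = 10  # the 'retain for ten years' demand


def add_years(d: date, years: int) -> date:
    """Add whole years, moving 29 February to 28 February when necessary."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def record_state(created: date, today: date) -> State:
    """Decide where a record belongs, given its creation date."""
    if today >= add_years(created, DESTROY_AFTER_YEARS):
        return State.DESTROYABLE
    if today >= add_years(created, SEAL_AFTER_YEARS):
        return State.SEALED
    return State.ACTIVE


print(record_state(date(2015, 1, 1), date(2017, 12, 31)))
print(record_state(date(2015, 1, 1), date(2018, 1, 1)))
print(record_state(date(2015, 1, 1), date(2025, 1, 1)))
```

The sealed stage is what lets both parties believe they have been obeyed: the data is out of everyday reach after three years, yet it still exists if a court asks for it.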
