Data Retention: An Inexact Art

A myriad of laws regulate data retention, and they often compete with each other. William Brewer provides an introduction to the principles and gives good advice: leave the details to the specialists.

There are several reasons for understanding and categorizing your organization’s data. While security and privacy are important, so is data retention, once known as document retention.

Data retention can’t keep up with the technology. Once, when I was the chief data guy for a telecommunications company, I checked the legislation that dictated the length of time that one should retain logs of phone calls. Privacy legislation mandated that they be deleted once the users had accepted and paid their invoices. National government requirements covering the countries we operated in varied considerably, but generally mandated that they be held for five years. Our own legal team wanted a blanket retention period of six years to cover potential litigation. We had agreements with some professional bodies and trades unions that specified a maximum of two years. Occasionally we had requests from the legal representatives of our users for evidence from logs that could prove that they were in a certain place at a certain time, on one occasion going back longer than five years. We also had to keep traffic data for five years for reasons of national security. These competing requirements were hard to reconcile.

The requirements of businesses, government, professional conduct rules or the law dictate that we should retain data, sometimes permanently. Your organization may be required through government legislation to retain data in case it is needed for evidence by the government, international agreement, the tax offices, the medical services, health and safety, security services, a court of law, or the police. Your business will have its own reasons for keeping records, and some of these go back over a hundred years. Historians, genealogists, biographers and sociologists are grateful for our cautious habit of retaining information about people and the things they do.

Laws take a long time to derive, evolve and be accepted. Fortunately, society has, until recently, managed to keep pace. This changed with the arrival of information technology. The law now struggles to keep up with our capabilities. This is particularly so with the topic of data retention.

Until recently, the topic of data retention could be best described as dull. It was the arrival of ‘Big Data’ that changed things because of the potential for abuse that is inherent in traffic analysis and mass surveillance. By analysing the retained data from telecommunications and the internet, governments can identify the locations of individuals, their associates and the agents of political or social unrest. Whether the political activities, or the surveillance of them, are lawful depends on the constitutions and laws of each country or federation. Whatever the law, there is a moral dilemma.

What is Data Retention?

Data retention dictates how you deal with all the records of the work and transactions of your organization in order to meet the many legal and business requirements. It covers not just the organization’s databases but the full gamut of documents. Nowadays, documents are still documents however they are held, whether in emails, files, paper, databases, or social media. There are two competing pressures on the retention of documents. On one hand there is the requirement for personal privacy, and on the other the need to prepare for any eventuality that would require you to refer to the records. It is difficult to predict these eventualities, but they can include gathering facts or evidence categorized as medical, HR, pension, properties, fire/flood, exposure to hazards in the workplace, or administration.

Data retention has always existed in commerce, trade, banking, industry and the professions. It is ingrained into the way that we live. Detailed records still exist, for example, for the manufacture and ownership of steam traction engines sold in the 1920s in sufficient detail to rebuild one from scratch or to trace the descendants of its owners. The earliest writings, preserved for thousands of years, were about audit, trade, and commerce. We are meticulous in our retention of data. What has changed now is our ability to retain irrelevant personal data in order to improve surveillance or to target existing or potential customers. It is here that we get into a gray area where the morality is unclear, and the legal aspects are trying to catch up.

Data Retention Principles

As you may imagine, the issues are complex and slightly paradoxical. The requirements for data retention get more complicated for multinational companies, or organizations operating in more than one jurisdiction. The most practical strategy is to make use of existing frameworks for document and data retention.

Establishing Your Data Retention Policy

The average manager in IT cannot even attempt to understand all the laws that govern real practice in what they do. Instead, the organization is wise to adopt a legislative framework that reconciles all the various laws and industry practices, along with all the commitments to self-regulation that the trade or industry has agreed. Any employee in the organization just adopts the framework and lets the specialists worry about the details. In the UK, for example, the various layers of government, higher education and the National Health Service issue legislative frameworks that provide summary advice on document retention that is constantly updated. The frameworks or policies that deal with data retention balance the many legal and privacy obligations against economics and business requirements to determine the retention time, the rules for archiving data, data formats, and the way that data is encrypted, stored and accessed.

It is useful to create a local IT data retention policy that covers all the IT issues that are extracted from the legislative framework adopted by your organization. You can then assess your compliance against the policy and adjust it as necessary. This should include standard retention periods for specific categories of documents. As I’ve already discussed and illustrated, some categories of documents will need to be retained for longer than others. How long you retain different categories of personal data should be based on the purpose and business needs.
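As a rough illustration of what "standard retention periods for specific categories" might look like once written down for IT use, here is a minimal Python sketch. The category names and the periods in RETENTION_YEARS are hypothetical placeholders, not legal advice; real values would have to come from the framework your organization has adopted. The helper names add_years and disposal_date are likewise invented for this example.

```python
from datetime import date
from typing import Optional

# Hypothetical categories and periods, for illustration only. Real values
# must come from the legislative framework your organization has adopted.
RETENTION_YEARS = {
    "invoice": 6,
    "call_log": 5,
    "marketing_contact": 2,
    "trust_deed": None,  # None means "retain permanently"
}


def add_years(start: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # Only reachable when start is 29 Feb and the target year is not a leap year.
        return start.replace(year=start.year + years, day=28)


def disposal_date(category: str, created: date) -> Optional[date]:
    """Earliest date a record may be disposed of, or None if it is kept permanently."""
    if category not in RETENTION_YEARS:
        # An unknown category is treated as an error rather than silently purged.
        raise KeyError(f"no retention period defined for category {category!r}")
    years = RETENTION_YEARS[category]
    return None if years is None else add_years(created, years)
```

Raising an error for an unlisted category is a deliberate choice in this sketch: data that the policy does not yet cover should prompt a review, not a default deletion.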

Review the Data that You Have

In parallel with the work of creating a clear local policy for data retention, you need to review the data itself to be clear on how that policy must be applied, and what the policy must cover. This review of the data must be able to answer the following questions.

  • What is the nature of the data that is held?
  • Is current and archived data being stored sufficiently securely?
  • Where does it reside? (including archives, backups and deleted data)
  • How long is it kept?
  • Why is it retained?
  • How and when is it archived or deleted?
  • Where are the various categories of data such as personal data, manufacturing data or product development data held?
  • How long will this data be useful to the activities of the organization?
  • Could data be potentially requested for litigation? If so, what is the limitation period?
  • Is this data subject to legislation that requires a retention period?
  • Is this data likely to be required for surveillance, tax investigations, or police work?
  • How is data ‘expired’?

How Long Must Data be Retained?

The answer ranges anywhere from never to permanently. It depends on the type of data and the legislative framework. It is common to have conflicting legislation, but, in practice, reason trumps dogma and a compromise is generally possible. The MoReq2010 (Model Requirements for the Management of Electronic Records) is a good, but lengthy, guideline for the management of electronic records, although without any legal backing. It is, however, a good place to start in establishing a policy and its use is most unlikely to meet criticism.
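To make the idea of conflicting requirements concrete, the following Python sketch takes a set of rules, each with an optional minimum and an optional maximum retention period, and reports the longest minimum, the shortest maximum, and whether the two collide. The Requirement class and the reconcile function are hypothetical names invented for this illustration; the sample figures are the ones from the telephone-log anecdote earlier in this article. The code only detects a conflict. Deciding on the compromise remains a job for the specialists.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Requirement:
    source: str
    min_years: Optional[float] = None  # must be kept at least this long
    max_years: Optional[float] = None  # must be deleted by this age


def reconcile(requirements):
    """Return (longest minimum, shortest maximum or None, conflict flag)."""
    minimums = [r.min_years for r in requirements if r.min_years is not None]
    maximums = [r.max_years for r in requirements if r.max_years is not None]
    keep_at_least = max(minimums, default=0)
    delete_by = min(maximums, default=None)
    # A conflict exists when something must be kept longer than it may be kept.
    conflict = delete_by is not None and keep_at_least > delete_by
    return keep_at_least, delete_by, conflict


# Figures taken from the call-log example in this article.
call_log_rules = [
    Requirement("national government", min_years=5),
    Requirement("legal team (litigation)", min_years=6),
    Requirement("professional bodies and unions", max_years=2),
]

keep_at_least, delete_by, conflict = reconcile(call_log_rules)
print(keep_at_least, delete_by, conflict)
```

With these inputs the function reports a conflict, because a six-year minimum cannot be satisfied alongside a two-year maximum without someone making a policy decision.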

Much data is held by businesses merely in case of unexpected litigation and therefore (in the UK) can usually be disposed of after six years (Limitation Act 1980). In the USA, this varies considerably from state to state. If there is no express limitation, records and information must be retained until there is no possibility of litigation (or in cases of investigation by the Police, External or Internal Audit). Most organizations need to keep a permanent record of such things as actuarial valuation reports, assessments under health and safety regulations, records of consultations with safety representatives and committees, tax approvals, records of members of a senior management team, trade union agreements, trust deeds and rules, trustees’ minute books or works council minutes.

Public organizations also need to keep records in the event of claims, or in case individuals exercise their right under freedom-of-information legislation to access all but exempt or confidential data.

Until recently, all important business documents were printed and archived. This is no longer always the case, and electronic documents, normally PDF or occasionally MS Word, are now considered to be evidence if it can be demonstrated that they have not been tampered with. On top of the requirements of the organization, most companies face many laws and regulations which determine how long data should be retained. Electronic documents and databases need to be archived according to a labyrinthine collection of national and international laws.

An increasing number of international laws demand data retention, such as the many money-laundering and tax enforcement laws. As well as your own national or state laws, there are the laws of your trading partners. For example, financial institutions around the world are required by the US Foreign Account Tax Compliance Act (FATCA) to retain and report information in order to disclose “financial accounts held by U.S. taxpayers or foreign entities in which U.S. taxpayers hold a substantial ownership interest.” FATCA is infamous for its reach but, even if it did not exist, document retention and production is still required for any business conducted with US citizens, and the EU has equivalent legislation.

Generally, there is a range of Record Retention Schedules (RRS) available that will be relevant to most organizations. These aim to summarize how long all the different types of data must be held. They will vary greatly according to nation, state and line of work.

Record Retention Schedules will need to specify a retention time for many types of data, such as:

  • Intellectual property, patents and trademarks
  • Contracts, deeds, agreements, judgements and adjudications
  • Employee and customer information such as personnel records for staff
  • Information about members of the public such as marketing information or medical records for patients
  • Financial information, invoices, purchase orders, bank statements and monetary transactions
  • Correspondence, reports and advice given and received
  • Memoranda, operating instructions, executive decisions, policies and procedures
  • Agendas, minutes, decisions, actions and outcomes of meetings
  • Registrations, appointments and announcements
  • Any other necessary documents, communications and information received by the organisation, or generated by the organisation, in the course of conducting its business

A typical Record Retention Schedule for an organisation will categorize the type of document according to the type of legislation that dictates its retention. It will cover the data concerning governance/corporate management, finance, personnel/HR, physical resources (e.g. estate management, equipment, health and safety, IT infrastructure), information services (intellectual property, audits), business activities, external communications and marketing. Document retention periods will range from zero to eternity and will be dictated by a wide range of law and practice. For example, the retention of records for training of staff in on-premises health and safety in the UK is currently dictated by seventeen separate regulations.

Data as Discoverable Evidence

Discovery in legal proceedings such as litigation, government investigations, or Freedom of Information Act requests, is subject to rules of civil procedure and agreed-upon processes, often involving review for privilege and relevance before data are turned over to the requesting party. In the United States, the discovery of digitally-based information was the subject of amendments to the Federal Rules of Civil Procedure (FRCP), 2006-2015, and in England and Wales it is governed by Part 31 of the Civil Procedure Rules. Organizational data, including emails and documents, is often requested as evidence in litigation. It isn’t just about paper files. The way you store your information is irrelevant. It will be in a number of places: it could be held in the organization’s databases, on the filesystem of the network, on individuals’ computers, on external media or on backup tapes. Employees who work from home are likely to have discoverable information about company business on laptop computers, home computers, personal e-mail accounts and phones. That information is also discoverable. If you have information that is relevant to litigation, whatever the form, it will be discoverable.

Any organization can find itself to be a litigant, and it has a duty not to destroy evidence that might be discoverable in a lawsuit. This is called the ‘duty to preserve’. Even a delay in providing the emails, documents or database records can lead to a retrial. Even if no discovery request has been made, an organization that knows it is likely to be a litigant has a duty not to delete any data that could conceivably be requested as evidence by either plaintiff or defendant. Nowadays, this includes more than just paper records. Information stored in your ‘personal productivity tools’, such as e-mails, electronic calendars and drafts of documents on the network, is not only regarded as evidence but is routinely requested. As the type of litigation is often difficult to predict, this means retaining emails in an archive for disposal after six years (Limitation Act 1980). Courts take a poor view of a litigant claiming that all emails on the subject of the litigation had been deleted. Emails can also contain financial or personnel information that is likely to need retaining for different periods, and under a different regime, from ordinary emails. There are a number of business records that should be held for up to fifty years. However, these are unlikely to have their primary source in an email.

What Needs to Be Preserved?

If you or your organization gets involved in a dispute, you will need to be able to access all the information that is relevant, or even arguably relevant, to your dispute. You will also need to be able to prove that the data has not been tampered with, usually by means of metadata, which is not found in paper documents and can play an important part as evidence. To start with, you must not erase documents and data that are part of business records, even if you would normally conduct periodic purges of the data. Instead, they must be archived until all obvious risks of litigation are over. Emails as originally sent or received must be archived.
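In a routine purge job, the rule that preservation overrides normal disposal can be expressed as a simple guard. The Python sketch below is one hedged way of doing so; the Record class, its legal_hold flag and the may_purge function are hypothetical names for illustration, not part of any particular records-management product.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Record:
    record_id: str
    dispose_after: Optional[date]  # None means "retain permanently"
    legal_hold: bool = False       # set when litigation is pending or anticipated


def may_purge(record: Record, today: date) -> bool:
    """Return True only if the scheduled purge is allowed to delete this record."""
    if record.legal_hold:
        # The duty to preserve overrides the normal retention schedule.
        return False
    if record.dispose_after is None:
        return False
    return today > record.dispose_after
```

The design choice here is that the hold is checked first, so that a record whose retention period has long expired is still kept while a dispute is live.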

Data needs to be held in only one form, as long as it is possible to prove that it hasn’t been tampered with. Hard copies of electronic data are often preferred by courts, and are often convenient, but electronic forms are sufficient if there is an audit trail that proves the integrity of the data.
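One common technical building block for such an audit trail is a cryptographic hash recorded when a document is archived, with each log entry also covering the hash of the previous entry so that later alteration of the log is detectable. The Python sketch below illustrates the idea using the standard hashlib and json modules. The function names and the log layout are assumptions made for this example, and a hash chain on its own does not settle whether a court will accept the evidence.

```python
import hashlib
import json


def fingerprint(content: bytes) -> str:
    """SHA-256 digest of the content, as a hex string."""
    return hashlib.sha256(content).hexdigest()


def append_entry(log: list, record_id: str, content: bytes) -> dict:
    """Add an entry whose hash also covers the previous entry's hash."""
    prev_hash = log[-1]["entry_hash"] if log else ""
    body = {
        "record_id": record_id,
        "content_hash": fingerprint(content),
        "prev_hash": prev_hash,
    }
    entry_hash = fingerprint(json.dumps(body, sort_keys=True).encode("utf-8"))
    entry = {**body, "entry_hash": entry_hash}
    log.append(entry)
    return entry


def verify_log(log: list) -> bool:
    """Recompute every entry hash and check that the chain is unbroken."""
    prev_hash = ""
    for entry in log:
        body = {k: entry[k] for k in ("record_id", "content_hash", "prev_hash")}
        expected = fingerprint(json.dumps(body, sort_keys=True).encode("utf-8"))
        if entry["prev_hash"] != prev_hash or entry["entry_hash"] != expected:
            return False
        prev_hash = entry["entry_hash"]
    return True


def content_matches(entry: dict, content: bytes) -> bool:
    """Check an archived document against the hash recorded for it."""
    return entry["content_hash"] == fingerprint(content)
```

To check a document later, recompute its fingerprint with content_matches and confirm with verify_log that the log itself is intact. In practice the log would also need to be stored where the people who manage the documents cannot rewrite it.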

‘Big Data’ and Surveillance

The GDPR builds on the EU Data Protection Directive, which states that any information that relates to an ‘identified or identifiable natural person’ must be deleted (this includes any backed up information) unless national law allows for such retention. Personal data can be retained for a longer period (which is an additional period of five years) if good reasons are provided for under local legislation. Otherwise, the personal data must be deleted. Amongst the good reasons for holding data are police surveillance, tax investigations, the detection of international money-laundering, and government security work.

For this, the Data Retention Directive of the European Union focused only on data about telecommunications or the internet, and data about tracking communications. This directive specified that data was required to be held securely for at least six months and, at the most, 24 months. No access to the data could be made without the permission of a court. Unfortunately, this directive fell afoul of the Court of Justice of the European Union and court cases brought against some national governments. The judgements decided that imposing a general obligation to retain data violated the EU Charter of Fundamental Rights, in particular the right of privacy. However, the court accepted that retention is compatible with EU law if access is conditional, and if it is deployed against specific targets to fight serious crime. Also, the court decided that telecommunication and internet records would constitute personal data where the individual concerned can be identified. The current position remains complicated, and the laws of EU members vary slightly. The United Kingdom passed the Data Retention and Investigatory Powers Act 2014 to try to deal with these concerns.

There is a view amongst security experts that a two-year retention is inadequate. One can argue that even ten years isn’t enough because terrorist cells and groups often remain dormant for many years before they act. From this, one might conclude that member states of the EU should be permitted to expand the data retention period limit. However, the data retention period cannot be extended as EU member states have signed up to the Charter of Fundamental Rights of the EU which provides the right to privacy of EU citizens. Also, individual European nations with long-standing terrorist problems cannot plead an extended limit because this would cause a disparity in the laws within the EU that could, theoretically, affect the free movement of goods and services within the EU.

For detecting and tracing money laundering, the Fourth European Anti-Money Laundering Directive (AML) requires that personal data be deleted after the expiry of a minimum retention period of five years. A further retention may only be granted, if necessary, for prevention, detection or investigation of money laundering or terrorist financing.


IT professionals must be increasingly aware of data retention and how it fits into many production systems such as email archiving and organizational file systems. It also affects the data lifecycle and the database lifecycle. It impinges on topics such as High Availability, secure archiving and resilience. Data professionals need to be acutely aware of their responsibilities to the organization they work for in curating and preserving its data in line with the many competing requirements on anyone who holds, processes, or publishes data. It isn’t the most glamorous or exciting job role in IT, but it is one of the most important. The consequences for any organization of failing to manage the lifecycle of its data properly can be catastrophic. Imagine being fined for mismanaging personal data, losing legal actions through failing to allow data discovery, or being penalized by the tax authority for failing to be able to prove your tax declarations. What about the prospect of a draconian fine for failing to comply with money-laundering legislation? What about losing valuable patents through failing to provide the documentation, or accidentally ‘end-of-lifeing’ the documents that can clearly disprove accusations of financial mismanagement for a company? The retention, archiving and retrieval of organizational data can be a serious issue for the health of the organization.