One can never just shrug and say that the security or stewardship of information is an IT problem. The way that any organization handles information isn’t just a matter of technology, but of all the processes and activities of the organization; it affects the way that the organisation operates. There is no point having secure databases, for example, if the organisation leaves personal medical data unsecured in an empty building, loses unencrypted DVDs containing police interviews, identifies individuals in bulk emails, leaves sensitive documents on public transport, or allows its members to access, or even give out, personal information freely. Responsible data stewardship has to be an objective for every member of an organisation.
The securing of data is an important part of an organization’s journey to full GDPR compliance, but it is just a part, and unless it is accompanied by policies and actions across the entire organisation, it is like locking the front door of a house while leaving all the windows open.
Why Categorize Data?
One’s first reaction on reading about the sudden flurry of new data privacy legislation is to aim to treat all data as critical in its confidentiality and integrity. This is a bad idea: it would be too costly, would require much larger IT teams, and would gridlock any organisation’s rapid decision-making and creative thought. Data is the life-blood of any organisation, and it must be easy for members of the organisation to get what they need when they request it.
To provide the same standards to all data systems is a hopeless aim, especially when faced with the pressures on organizations to deliver new information systems rapidly. It is much more effective to use resources to provide a range of data management standards and ring-fence data that requires high standards of confidentiality, integrity and availability.
To do this, you need to understand up-front what data is being held by the organisation, where and how, and what roles in the organisation need to access that information. Then you can provide appropriate management of that data. The art is to resist giving any of these roles too much data, or any data that they don’t need.
There are many ways of going about this task, but we’ll just tackle the minimum requirements.
In my experience, the larger enterprises have, for a long time, taken the security of data seriously. This generally takes a layered approach, where highly sensitive information is held in highly-secure data centres, and less sensitive information is held under a progressively less rigorous regime, all the way down to an Excel-based spreadsheet on a PC that relies only on Windows Security.
There are increasing problems now with this approach. Although many of the good practices of classic IT departments are still valid, the problem with many of these regimes is that they were created at a time when data was copied with great difficulty and in ways that were hard to conceal. Data could either be accessed in its entirety or not at all. The arrival of the internet was as fundamental to data centres as the arrival of gunpowder was to the usefulness of stone castles. Pocket computers, thumb-drives, fast networks and Wi-Fi have made security vastly more complicated.
There are, in my experience, two aspects that are often missed. As well as determining the type, and level, of risk associated with each type of data, we need to be clear about who needs to access the information, and in what form. Most internal data breaches have resulted from employees being given too much access for their role. It is no longer sufficient just to categorize the data in isolation: it must be categorized in context, considering the form it should take, how roles within the organization are supposed to work with it, and the organizational processes that require it. Very often, roles are given access to individual records or entire subsets of tables when all they really require are aggregations from the data. For example, a recent data breach happened when an internal auditor dumped employee payroll data on the dark web. Everyone, even auditors and Database Administrators, needs limits on what data they can access.
The Classic Layered Approach to Data Classification
The objective of the classic models of data security classification has been to ensure that IT production staff can provide the correct regime for any system. The organization’s Computer Manual would normally specify what was required for each categorization. Any data, such as a list of customers, suppliers, sales, purchases, or payroll would be given a level for three core security objectives. These originally focused on the process of protecting the intellectual property of an organisation rather than the privacy of individuals, but more recent use has emphasized personal privacy.
- Confidentiality – preventing the unauthorized or ill-considered disclosure of data, protecting personal privacy and proprietary information
- Integrity – preserving the authenticity of data, and guarding against improper information modification or destruction
- Availability – ensuring timely and reliable access to, and use of, information.
Depending on the organisation, there may be others such as authenticity, accountability, non-repudiation and reliability.
The importance of applying any of these security objectives is normally considered to be:

- None – if a breach causes no financial loss, no damage to the assets, finances, mission, safety, or reputation of the organisation, and no harm to the privacy of individuals; this applies where the data is intended for public disclosure.
- Low – if it is merely data associated with work in progress, where a breach would cause only minor financial loss, minor damage to the assets of the organisation, or minor harm to privacy.
- Moderate – if a failure of security could be expected to have a serious adverse effect on organizational operations, organizational assets, or individuals.
- High – if a security breach could have a severe or catastrophic adverse effect on organizational operations, organizational assets, or individuals. This includes highly sensitive information that must be kept confidential as a matter of contractual obligation, law, or regulation; it applies where the organisation is legally obliged to notify individuals of any breach of confidentiality, and where disclosure of the data would have a severe adverse impact on the organization’s mission, safety, finances, or reputation, or on the people interacting with the organisation.
Basically, every data entity should be rated for each security objective from high importance to none.
You will, for example, probably rate as High-risk Confidentiality any data that is marked ‘confidential’ by the organisation or which is covered by federal or national law, such as data containing personally identifiable data, medical information, authentication identifiers, data covered by an NDA or confidentiality agreement, export-controlled materials, tax information, payment card information, Controlled Technical Information and private data (physical, medical, physiological, genetic, ethnic, biometric, social, economic, or cultural).
It is possible for some data to be high-risk for integrity while not normally being considered high-risk for confidentiality: logs, for example, or other materials that might be submitted as evidence in a criminal or civil prosecution, and so must be strongly protected against manipulation.
A database inherits the highest rating of all the entities it contains for each category. If, for example, just one table in an RDBMS has a high rating for confidentiality, then the entire database has to be handled within the regime defined by the organisation for that category of data. This means that if you have Confidentiality/high-risk personal information in just one table of a database, then the whole database inherits the need for a Confidentiality/high-risk regime for its development, maintenance and operational requirements. If it provides just one service that is Availability/High-risk then the whole operational system needs a high-availability regime.
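The inheritance rule described above can be sketched in a few lines of Python. This is a hypothetical illustration: the rating scale follows the None/Low/Moderate/High levels from earlier in the article, but the table names and their ratings are invented for the example.

```python
# Ratings in ascending order of importance, as described earlier.
RATINGS = ["none", "low", "moderate", "high"]

def database_rating(table_ratings):
    """Return the per-objective rating a database inherits from its tables:
    for each security objective, the highest rating of any table it contains.

    table_ratings: dict mapping table name -> dict of objective -> rating.
    """
    inherited = {}
    for ratings in table_ratings.values():
        for objective, level in ratings.items():
            current = inherited.get(objective, "none")
            if RATINGS.index(level) > RATINGS.index(current):
                inherited[objective] = level
            else:
                inherited.setdefault(objective, current)
    return inherited

# Hypothetical example: one high-risk table drags the whole database up.
tables = {
    "Products": {"confidentiality": "none", "integrity": "moderate", "availability": "low"},
    "Payroll":  {"confidentiality": "high", "integrity": "high", "availability": "moderate"},
}
print(database_rating(tables))
# The single high-risk Payroll table means the whole database must be run
# under a confidentiality-high, integrity-high regime.
```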
This will seem obvious, but it means that many organisations disseminate high-risk data throughout all their applications, especially if these are based on areas of organizational activity such as design, manufacturing, sales, and so on. Inevitably, all databases then end up with expensive and restrictive security regimes, even though most of the data is entirely innocuous, with no security objectives. Even worse is the practice of ‘gross denormalization’, where records are duplicated across systems, so that a deletion or amendment to a record becomes a computer-science nightmare. In this case, every system needs expensive and time-consuming security, and the risks are multiplied.
Protecting confidentiality requires you to know when an attempt is being made to gain unauthorized access to your data, preferably with as good a means as possible of identifying the attacker, as well as having defenses in depth. By analogy, you need the alarms and closed-circuit television as well as the walls, and you should make sure there is barbed wire: make it as awkward as possible for the attacker to get at the data (e.g. via encryption and pseudonymization). In security, you have to assume that all the levels above the one you are securing can be breached; otherwise a trivial error, such as a security patch not being applied, will expose a gloriously vulnerable landscape to the attacker.
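As an illustration of the pseudonymization mentioned above, here is a minimal Python sketch using keyed hashing (HMAC). The key value and the 16-character truncation are assumptions made for the example, not a recommendation; in practice the key would live in a key vault, not in source code.

```python
import hmac
import hashlib

SECRET_KEY = b"hypothetical-key-kept-in-a-key-vault"  # illustrative only

def pseudonymize(identifier: str) -> str:
    """Replace a direct identifier with a stable, keyed pseudonym.

    HMAC is used rather than a plain hash so that an attacker without the
    key cannot confirm a guessed identifier simply by hashing it themselves.
    """
    return hmac.new(SECRET_KEY, identifier.encode("utf-8"),
                    hashlib.sha256).hexdigest()[:16]

# The same input always yields the same pseudonym, so joins and
# aggregations still work on the pseudonymized data, while re-identifying
# an individual requires the key.
assert pseudonymize("jane.doe@example.com") == pseudonymize("jane.doe@example.com")
assert pseudonymize("jane.doe@example.com") != pseudonymize("john.doe@example.com")
```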
Having done all that, it pays to tie down exactly what the users within the organisation require the data for. If business analysts, for example, require tables of information so that they can aggregate the data in various ways, then it is fairly certain that they only need access to aggregations above the level of the individual data records; where they do need record-level data, they can instead use pseudonymized data, so that it is extremely difficult to identify individuals. It is very rare for any role to require large lists of customer records simultaneously, yet many organizations use applications that download far more information, and in more detail, than is required.
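The idea of granting a role access only to aggregations can be sketched with an in-memory SQLite database (the schema, data, and role are hypothetical). The analyst role would be granted access to the view, never to the base table.

```python
import sqlite3

# Illustrative schema: analysts need counts and averages per region,
# not the individual customer rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (name TEXT, region TEXT, lifetime_value REAL);
    INSERT INTO Customer VALUES
        ('Ann',  'North', 120.0),
        ('Ben',  'North',  80.0),
        ('Cara', 'South', 200.0);
    -- Expose only the aggregate that the analyst role actually needs.
    CREATE VIEW RegionSummary AS
        SELECT region,
               COUNT(*)            AS customers,
               AVG(lifetime_value) AS avg_value
        FROM Customer
        GROUP BY region;
""")
for row in conn.execute("SELECT * FROM RegionSummary ORDER BY region"):
    print(row)
# ('North', 2, 100.0)
# ('South', 1, 200.0)
```

In a full RDBMS such as SQL Server, the same pattern is enforced with permissions: the role is denied SELECT on the base table and granted it only on the view or procedure.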
A regime for high-risk confidential information will include a firewall, an intrusion-detection system, the collection of forensics, auditing, and alerts for unusual patterns of data access. It will minimize the attack surface. It will log all failed authentication attempts, limit the number of attempts, and ensure that credentials cannot be guessed within those limits. It will monitor for signs of attempted SQL injection, such as SQL errors (usually probing guesses at table names) and attempts to access database objects without the correct access rights. It will check for attempts at privilege escalation and all other DCL (Data Control Language) actions. In a secure database, only the administrative role has access to the base tables, and ordinary users access the information only via functions or procedures. In SQL Server, this is achieved via least privilege, role-based schemas and ownership-chaining.
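One small piece of such a regime, limiting failed authentication attempts within a time window, might be sketched like this. The threshold and window are assumed policy values for the example, not a standard.

```python
import time
from collections import defaultdict, deque

MAX_FAILURES = 5       # assumed policy value
WINDOW_SECONDS = 300   # assumed policy value

failures = defaultdict(deque)  # account -> timestamps of recent failures

def record_failure(account: str, now: float) -> bool:
    """Record a failed login attempt; return True if the account should
    be locked and an alert raised for investigation."""
    window = failures[account]
    window.append(now)
    # Discard failures that have fallen outside the monitoring window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    return len(window) >= MAX_FAILURES

# Six rapid failures against one account trip the lockout on the fifth.
now = time.time()
locked = [record_failure("auditor1", now + i) for i in range(6)]
print(locked)  # [False, False, False, False, True, True]
```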
It is wrong to consider data encryption a complete security method in itself: its value lies more in making some methods of illicit data access too much of a nuisance. Two-factor authentication, transaction authorization, biometric verification, security tokens, key fobs and soft tokens are all likely to slow down or stop an attacker but, like encryption, none is by itself sufficient to ensure confidentiality. Extra precautions might be taken for extremely sensitive documents, such as storing them only on air-gapped computers or disconnected archival storage devices or, for the most sensitive information, only in hard copy.
Ensuring Data Integrity
Data integrity boils down to both physical integrity and logical integrity.
Problems of physical integrity are most likely when data is in transit or within an ETL process. They can be due to a wide range of events, such as faulty data entry, electromagnetic pulse (EMP), extreme temperatures, sabotage, or a server crash. Data transfer must be done using error-detecting algorithms, which might include checksums, hash functions, error-correcting codes or even cryptographic checksums, in order to be certain of physical integrity. Backups or redundancies must be available to restore the affected data to its correct state. Some form of master data management is a good insurance against this.
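A minimal sketch of checksum verification for data in transit, using a SHA-256 digest; the payload is invented for the example. The sender publishes the digest alongside the data, and the receiver recomputes it on arrival.

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """Hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()

# Sender computes a digest before transfer...
payload = b"patient_id,reading\n1001,7.2\n1002,6.8\n"
digest = sha256_of(payload)

# ...and the receiver recomputes it on arrival and compares.
received = payload  # in practice, the bytes that came off the wire
assert sha256_of(received) == digest, "integrity check failed"

# Even a single altered byte is detected.
corrupted = payload[:-1] + b"9"
print(sha256_of(corrupted) == digest)  # False
```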
Logical data integrity can be maintained at a number of levels. Database-level constraints provide data-level protection against integrity problems. Error-detecting strategies such as checksums are likely to detect changes in data while it is in transit, and audit trails must be kept to alert for unauthorized changes. The most obvious precaution for ensuring the integrity of data is to restrict write access to the very few roles within the organisation that need it, via file permissions and user access controls. Version control can sometimes be used to prevent erroneous changes or accidental deletion by authorized users. Data integrity becomes a key issue when data has to be submitted as forensically-sound evidence.
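A tamper-evident audit trail can be sketched as a hash chain, where each entry's hash covers the previous entry's hash, so a later modification to any entry breaks the chain. This is a simplified illustration, not a production design; the log entries are invented for the example.

```python
import hashlib
import json

def append_entry(log, entry):
    """Append an entry whose hash covers the previous entry's hash."""
    prev = log[-1]["hash"] if log else "0" * 64
    digest = hashlib.sha256(
        json.dumps({"entry": entry, "prev": prev}, sort_keys=True).encode()
    ).hexdigest()
    log.append({"entry": entry, "prev": prev, "hash": digest})

def verify(log) -> bool:
    """Recompute the chain from the start; any edit breaks it."""
    prev = "0" * 64
    for record in log:
        expected = hashlib.sha256(
            json.dumps({"entry": record["entry"], "prev": prev},
                       sort_keys=True).encode()
        ).hexdigest()
        if record["prev"] != prev or record["hash"] != expected:
            return False
        prev = record["hash"]
    return True

log = []
append_entry(log, "user jdoe granted SELECT on Payroll")
append_entry(log, "payroll record 42 amended by hr_clerk")
print(verify(log))   # True
log[0]["entry"] = "nothing to see here"
print(verify(log))   # False - the tampering is detectable
```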
Availability is a separate topic, as it is so complex. It involves issues of maintenance, testing, applying patches, performing upgrades and generally maintaining a correctly functioning server and network environment without any software conflicts. It also requires a pessimistic attitude: providing redundancy and failover, and maintaining adequate storage and communication bandwidth alongside fast and adaptive disaster recovery. There should be an effective disaster recovery plan (DRP) that is implemented and checked via practice runs and rehearsals. There must be effective ways of avoiding downtime due to malicious cyber-attacks such as Denial of Service or network intrusions.
If an organisation has no central documented overview of the data it holds and processes, it is highly vulnerable to failures in its stewardship of data, with results that are likely to cause severe damage to that organisation. To protect anything, you have to know where it is and who needs to use it. With data, you have to know at least its relative importance in terms of its confidentiality, integrity and availability. You also need to know why it is retained and how it is used within the organisation, and by which role. With this information, you will then have a much clearer idea of the requirements for that data, sufficient to change the organizational workflows and applications to minimize the risks to that data.
If your organisation is ever caught up in a data breach or other incident that might affect its reputation or even result in legal action, the exercise of at least having taken information security seriously will provide mitigation for the organisation. Any organisation that takes its stewardship of data seriously and responsibly will take the next step and ensure that all data is held in an appropriate regime that will protect it from malice, disaster, conflict and human failings. They might even save on resources by reorganizing organizational data according to risk rather than by department or activity.
- Bayswater Medical Centre (ICO)
- Crown Prosecution Service (ICO)
- Gloucestershire Police (ICO)
- The Morrisons data breach and GDPR compliance
- What is confidentiality, integrity, and availability (CIA triad)?
- Information security (Wikipedia)
- Electronic authentication (Wikipedia)
- Non-disclosure agreement (Wikipedia)
- Export Controlled Information
- Tax Information Privacy and Confidential Tax Guidelines
- Defense in depth (computing) (Wikipedia)
- The Equifax Breach Was Entirely Preventable (WIRED)
- Attack Surface Analysis Cheat Sheet (OWASP)
- SQL Injection Prevention Cheat Sheet (OWASP)
- Transaction Authorization Cheat Sheet (OWASP)
- What is an Air Gapped Computer? How secure is one?
- Data integrity (Wikipedia)
- Forward error correction (Wikipedia)
- Master data management (Wikipedia)
- Planning for Disaster – (Simple Talk)
- Disaster Recovery Planning for Data: The Cribsheet (Simple Talk)
- Best Practices for Preventing DoS/Denial of Service Attacks