Questions about Data Governance that You Were Too Shy to Ask

A company’s data is one of its most valuable and important resources. Managing and protecting that data are big responsibilities, and data governance processes must be put into place to avoid misuse and to meet regulations. In this article, William Brewer answers questions you may have about data governance but were too shy to ask.

  1. Why does an organization need to have Data Governance?
  2. Does Data Governance restrict Agile Application Development?
  3. If my organization hasn’t got any data governance, where does it start?
  4. What is involved in Data Governance?
  5. Why is data governance suddenly becoming important?
  6. How should we be establishing data governance within the organization?
  7. Who is responsible for Data Governance in an organization?

1. Why does an organization need to have Data Governance?

This isn’t a sudden new requirement. Data governance has been practiced, and written about, since the 1960s. The enthusiasm that CIOs give to it has fluctuated over the decades, but it can no longer be considered optional now that legislation is catching up with our ability to process data. It is now easier not only to use data, but also to abuse it.

Any organization is legally and morally required to make sure its data is correct, that the organization understands and meets its obligations as a holder of data, and that the organization is entitled to use it. To do this, there must be a clear responsibility and authority for data-related matters that provides planning, monitoring, and enforcement over the proper management of data assets. It is a task that needs tact, flexibility and persuasion.

Legislation becomes more demanding, but there is a common thread in the demands, whether it be in accounting, audit, security, resilience, privacy or retention. You need to know your data, its characteristics, the responsibilities that go with it, where it is held, how it can be altered, who owns it, who requires it and so on. Generally, if the directors of a company face prosecution, ignorance of the company’s data is no longer accepted in mitigation.

The frustration felt by anyone tasked with the management of data is that the up-front effort, if done properly, makes IT initiatives quicker and cheaper to introduce and avoids the expensive disasters that follow from financial misstatement, decisions made on faulty or misrepresented data, breaches of sensitive or private data, or poor data quality for key decisions. Most organizations that achieve good standards of data governance report that it has led to quicker delivery and easier changes in applications, continuous improvement around financial reporting, and improved internal control of such reporting.

There are direct business benefits to having a well-defined approach to assessing, managing, using, improving, monitoring, maintaining, and protecting organizational information. By getting it right from the start, an organization can avoid the costly and time-consuming necessity of re-engineering applications to comply with the legislative framework and industry standards. It also makes the auditing process a lot cheaper.

2. Does Data Governance restrict Agile Application Development?

Not necessarily. If both are done right, the converse is true. Data governance needs to be given the opportunity to be involved at all relevant points in development using a similar model to operations teams with DevOps. As long as it meets the company’s objectives, Data Governance can be Agile too.

The approach to Data Governance needs to fit the organization’s culture, the maturity of its data management and the data governance objectives it needs to meet.

Many of the problems in the past have come through developers facing the requirements of data compliance only at the point of deployment. Data Governance in some organizations has lacked the agility to revise the model in the light of business changes or a clearer understanding of existing data, and has failed to promote any insights into the value of, and need for, data governance.

It is certainly true that there have been some heated debates in the past between Agile development teams and the data governance teams within some organizations. The friction between delivery and governance in IT has tended to focus on the idea that there can be one centralized data ‘truth’ within the enterprise. Developers sometimes regard any attempt to introduce an organization-wide data model, master data set or data warehouse as being inflexible and unachievable, and the cause of harmful delays and friction. The governance teams tend to view this as the refusal by developers to accept a business consensus on the nature of the data that the business holds. The delays can come from trying to gain an agreed view of the nature and definition of the various types of data being used by the organization, especially if the process ends up with a fragile compromise.

From the perspective of a delivery team with deadlines, it seems far better to concentrate on delivering a data model that appears to fit the application domain and then retrospectively consider how it fits into the corporate data strategy and legislative framework. In the light of the scale of modifications that are required to comply with upcoming legislation, this is becoming a reckless approach, and is hardly likely to meet the approval of the application’s business sponsors. Compliance cannot be bolted on as an afterthought: even Sarbanes-Oxley required independent ways of auditing financial processes, and the GDPR requires interactions with individuals that are intrinsic to the application, and so need to be considered early on in development.

It is a bad idea to impose data governance on unwilling delivery teams. It can lead to all sorts of complications such as ‘rogue’ data repositories, mapping systems, or an over-reliance on inflexible monolithic transactional systems. Not only can these defeat any chance of maintaining the rapid delivery of changes and enhancements to the business, but they also risk making subsequent necessary changes, such as the deletion of personal data on request, frighteningly complex.

3. If my organization hasn’t got any data governance, where does it start?

Probably the best place to start is to designate someone to be responsible for data governance. There should be a single organization-wide activity that can get processes in place and ensure that staff are well-informed about data. This needs to feed in reports at board level. How this is implemented depends on the size and style of the organization. For a startup, it could be one role among many for a single person, whereas it could require a large team in a multinational enterprise. However, it must exist because a failure to govern the use of data can be catastrophic. Whether the organization calls this department the ‘data office’, ‘data protection office’ or ‘data governance office’, the message must be clear that the organization cares about its responsibilities, and has sufficient staff and skills to discharge its obligations. By taking a reasoned viewpoint right across the organization, it can prevent duplication of effort, misunderstandings and conflict.

Because the system of data governance must remain effective in the face of changes in staff and management priorities, organizations need to have a defined process in place that can ensure that data is correct, consistent, of high quality, secure and legally used. This isn’t just a development concern, but is also an operational issue right through the complete data lifecycle.

There must be clear accountability for all aspects of data governance and it must be understood who in the organization is responsible for any aspect of data. An important aspect of the role is to give clarity to what the various responsibilities are. This should be an evolutionary process of cultural change for a company, altering the company’s way of thinking, and setting up processes for handling information that can be used by the entire organization.

4. What is involved in Data Governance?

It is important for the management of any organization that data is available, usable, correct and secure. There should be a single source of truth, and a common understanding and definition of what data represents. Data needs to be easily categorized according to the requirements for audit, security, retention and consent. Personal data can only be held on a lawful basis, such as the active consent of the person involved, and with the security that they expect. Commercial financial data must be immutable, in that only authorized changes can be made, unauthorized changes are detected, and all changes are audited by an independent system. This can only be done effectively by establishing processes that can ensure that the important data assets are formally managed, so that the users of the data can have confidence that it can be used for decision-making.
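One way to picture the requirement that unauthorized changes be detected is a hash-chained audit log, in which each entry's hash depends on the entry before it. The following is only a minimal sketch of that general technique, not a method prescribed by this article: the class name AuditLog, the field names and the sample invoices are all hypothetical. In practice the latest hash would also need to be held by an independent system, because anyone able to rewrite the whole log could recompute the chain.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class AuditLog:
    """Append-only log; each entry's hash covers the previous entry's hash."""
    entries: list = field(default_factory=list)

    @staticmethod
    def _digest(prev_hash: str, record: dict) -> str:
        # Serialize deterministically so the same record always hashes the same way.
        payload = prev_hash + json.dumps(record, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def append(self, user: str, action: str, detail: dict) -> None:
        prev_hash = self.entries[-1]["hash"] if self.entries else ""
        record = {"user": user, "action": action, "detail": detail}
        self.entries.append({"record": record, "hash": self._digest(prev_hash, record)})

    def verify(self) -> bool:
        """Recompute the chain; any edited record breaks it."""
        prev_hash = ""
        for entry in self.entries:
            if entry["hash"] != self._digest(prev_hash, entry["record"]):
                return False
            prev_hash = entry["hash"]
        return True


# Hypothetical sample data, for illustration only.
log = AuditLog()
log.append("alice", "update_invoice", {"invoice": 1001, "amount": 250})
log.append("bob", "approve_invoice", {"invoice": 1001})
intact_before = log.verify()

# Simulate an unauthorized change made directly to a stored record.
log.entries[0]["record"]["detail"]["amount"] = 9999
intact_after = log.verify()
```

Here intact_before reports that the chain is sound, while intact_after reports the tampering, because the stored hash no longer matches the altered record.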

Any organization should know, and be able to present to its auditors, or to government agencies, the details of the kind of data that it collects and holds on its customers and employees, and what data it shares with, or has processed by, third parties. It must know who in the organization has responsibility for this data, where it comes from, how long it is retained, whether it is shared, and what controls are in place. Because the data held by an organization can represent a significant proportion of its value, it needs the attention given to any significant corporate asset.
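The details listed above (who is responsible, where the data comes from, how long it is retained, whether it is shared, and what controls are in place) map naturally onto a simple inventory record. The sketch below is one possible shape for such a record, under the assumption that a plain in-memory list is enough for illustration. The names DataHolding, retention_expired and inventory_gaps, and all of the sample values, are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DataHolding:
    """One entry in a (hypothetical) inventory of the data an organization holds."""
    name: str
    owner: str                     # who in the organization is responsible
    source: str                    # where the data comes from
    contains_personal_data: bool
    retention_days: int            # how long it is retained
    shared_with: tuple = ()        # third parties it is shared with
    controls: tuple = ()           # controls that are in place


def retention_expired(holding: DataHolding, collected_on: date, today: date) -> bool:
    """True if data collected on the given date has outlived its retention period."""
    return today > collected_on + timedelta(days=holding.retention_days)


def inventory_gaps(holdings) -> list:
    """Names of holdings with no owner, or personal data with no recorded controls."""
    return [
        h.name
        for h in holdings
        if not h.owner or (h.contains_personal_data and not h.controls)
    ]


# Hypothetical sample inventory, for illustration only.
inventory = [
    DataHolding("payroll", "HR director", "HR system", True, 365 * 7,
                shared_with=("payroll bureau",), controls=("encryption", "role-based access")),
    DataHolding("web_analytics", "", "website", True, 365),
]
gaps = inventory_gaps(inventory)
```

A report such as gaps gives the data office a short list of holdings that could not yet be explained to an auditor.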

The organization must be aware of the impact that a breach would have on the organization, of its financial liability, and of the likely effect that such a breach would have on the subjects of personal information.

5. Why is data governance suddenly becoming important?

Our ability to collect and analyze data has not been matched by a common agreement on the legal and ethical issues of doing so. Although data governance has several benefits for the organization that adopts it, it has been easy in the past to assign it a low priority, particularly as the view has been expressed that it gets in the way of delivering functionality rapidly to the organization. However, the public has seen a large number of high-profile scandals such as Enron, WorldCom, Tyco International, Ashley Madison, and Equifax, and is now more aware of the sometimes-catastrophic consequences of data loss and data fraud. This has increased public support for more legislation and regulation on data, including Sarbanes-Oxley, Basel I, Basel II, HIPAA, GDPR and cGMP, and for a more vigorous enforcement of such legislation. Because the law is only belatedly catching up with the use of technology in commercial and scientific organizations, there is still a cultural mismatch in many companies that can cause spectacular public data breaches and other leaks.

The introduction of the GDPR marks the first time that organizations are expected to put into place ‘comprehensive but proportionate governance measures’, and are now legally obliged to perform certain activities such as privacy impact assessments in certain circumstances. They are responsible for complying with the principles of ‘privacy by design’.

6. How should we be establishing data governance within the organization?

Once a data office is in place within the organization, the obvious next step is to provide an operating model. The operating model for data governance depends on both the size and type of organization and its style of management. There are several ways of doing it and some of it can even be outsourced. Whatever style is adopted, the operating model must fit the culture of the organization. Whatever the size of the organization, an operating model must still exist, preferably within the existing management structure and processes.

There are many reasons for needing data governance initiatives, such as:

  • complying with regulatory law
  • harmonizing IT operations after rapid company growth, company acquisitions or corporate mergers
  • reducing mistakes in reporting, business intelligence and applications
  • reducing redundancy and mismatches in data between departments within the organization
  • preventing unnecessary ‘siloing’ of data within part of the organization

Here are some of the typical stages in developing a data governance activity in an enterprise.

Discovery Phase

This phase would provide a simple intelligible overall plan, assess the organization’s current maturity, identify the stakeholders, identify opportunities, business value and quick wins, and provide a data governance roadmap.

  • Developing a value statement: This explains why data governance is necessary. It should briefly and simply describe the scope, general overview, objectives, and the criteria for measuring success. It will probably list the benefits, such as consistency and confidence in decision making, maximizing the potential profit through using the data, enabling better planning by supervisory staff, decreasing the risk of regulatory fines and improving data security. It should also define and verify the requirements for data distribution policies. It should be careful to distinguish what aspects of data governance are required by law and what is merely industry good practice.
  • Preparing a roadmap: This should explain how to introduce data governance and what the landmarks are. It will probably need to establish process performance baselines and a statement of Business-IT alignment. It will need to spell out the necessary technical and organizational measures to integrate data protection into processing activities. To assist planning, the roadmap needs to provide a guide for managers to assess the current maturity of data governance.

Foundation Phase

This phase seeks to define the organization’s necessary objectives for sound data governance, communicate and educate stakeholders, secure executive support, and assign data stewards.

  • Planning and budgeting: This would provide the overall program plan and funding to provide data governance. It would provide the requirements for Change Management.
  • Designing the program: This would deliver the details of what is required. It would provide an Operating Model that describes who can take what actions with what information, and when, under what circumstances, using what methods. It would provide a communication and training plan. It would define the key roles and responsibilities with clear lines of communications.
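An Operating Model that describes who can take what actions with what information, and under what circumstances, can be thought of as a table of rules. The sketch below is purely illustrative and assumes a very small rule set. The Rule fields, the role names, the data categories and the two conditions ("always" and "with_approval") are hypothetical and are not taken from any standard or product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    role: str                    # who
    action: str                  # can take what action
    data_category: str           # with what information
    condition: str = "always"    # under what circumstances (hypothetical values)


# Hypothetical rule set, for illustration only.
RULES = (
    Rule("data_steward", "correct", "customer_contact"),
    Rule("analyst", "read", "customer_contact", condition="with_approval"),
    Rule("finance_clerk", "update", "ledger", condition="with_approval"),
)


def is_allowed(role: str, action: str, data_category: str, approved: bool = False) -> bool:
    """Look up the matching rule; anything not explicitly listed is denied."""
    for rule in RULES:
        if (rule.role, rule.action, rule.data_category) == (role, action, data_category):
            return rule.condition == "always" or approved
    return False
```

For example, is_allowed("analyst", "read", "customer_contact") is refused until approved=True is passed, and any combination that is not listed is denied by default. Writing the model down in this form makes it easy to review with stakeholders before any tooling is chosen.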

Implementation Phase

This will vary greatly depending on the size, culture and objectives of the organization. It could range from a highly structured organization with a set of defined processes with tools and templates to an ‘Agile’ team of individuals who work together to accomplish the goals and work through the roadmap.

  • Deploying the program: This will probably start by documenting your processing activities, their purpose, the extent of data sharing and data retention. The implementation will probably need to span policies, standards, processes and technology. It will need to ensure that records are easily kept up to date and reflect current processing activities. Typically, where personal data is involved, processes will be required for such things as privacy notices, records of consent and controller-processor contracts. The locations of all personal data will need to be identified, along with reporting such as Data Protection Impact Assessment reports, and records of personal data breaches and attempts at breaches.
  • Governing the data: Once a model or glossary of the organization’s data and data lineage is in place, it must be maintained by some means. The means by which this will be done will vary greatly according to the organization, and the type of model used. It may include a responsibility assignment matrix, a data maturity matrix and a data prioritization model as well as various types of data mapping.
  • Monitoring, measuring and reporting: This is essential to maintain and develop the service. It may be done by a data governance practice or corporate Data Authority and might require various tools such as a measurement dashboard. It can involve maintaining records of processing activities, monitoring service providers, impact assessments, regular audits, and HR policy reviews and updates. It can include training and awareness-raising programs.
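The responsibility assignment matrix mentioned under ‘Governing the data’ can be kept as simple structured data and checked automatically. The sketch below assumes the common RACI convention (Responsible, Accountable, Consulted, Informed), which is only one form such a matrix can take. The function name raci_problems, the asset names and the assignments are hypothetical.

```python
def raci_problems(matrix: dict) -> list:
    """Check each data asset for exactly one Accountable and at least one Responsible role."""
    problems = []
    for asset, assignments in matrix.items():
        codes = list(assignments.values())
        accountable = codes.count("A")
        if accountable != 1:
            problems.append(f"{asset}: expected exactly one Accountable role, found {accountable}")
        if "R" not in codes:
            problems.append(f"{asset}: no Responsible role assigned")
    return problems


# Hypothetical matrix: data asset -> {role: RACI code}, for illustration only.
matrix = {
    "customer_records": {"Data Owner": "A", "Data Steward": "R", "Data Custodian": "C"},
    "payroll": {"Data Owner": "A", "Data Governor": "A", "Data Steward": "I"},
}
problems = raci_problems(matrix)
```

In this sample, customer_records passes both checks, while payroll is flagged twice: it has two Accountable roles and nobody marked as Responsible.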

The scope of implementation may range from an enterprise-wide effort to one or more pilot projects. Sometimes a data governance initiative will start at the grassroots level, and will provide a limited implementation to demonstrate value to potential sponsors in senior management.

The style of implementation can vary too. Where a more agile approach is needed, the planning stages are still required, but they are used to identify those data governance initiatives that are smaller in scope but aligned with strategic projects or business needs. These are implemented first, and the program then builds on the initial success and experience. This allows everyone to be in touch with progress and decisions, and to appreciate the business value being realized.

7. Who is responsible for Data Governance in an organization?

There are distinct roles within an enterprise whose objective is to improve the accuracy, accessibility, consistency, and completeness of data, and to ensure that it is legally held and retained with appropriate security and access controls. In the larger organization, this team usually consists of executive leadership, project management, line-of-business managers, and data stewards. It is likely that the team will use a pre-existing methodology for tracking enterprise data, and tools for data mapping, data profiling, data cleansing, and monitoring. The following roles are often mentioned in descriptions of data governance.

  • Data Controllers: A legal entity, such as a company, that determines the purposes and means of processing personal data.
  • Data Protection Officers: A DPO has responsibility for an organization’s data privacy compliance. All organizations that meet the GDPR definition regarding their use of data should have a DPO, or have an external consultant to perform the role. They take responsibility for data protection compliance and must have the authority to carry out their role effectively. They must inform and advise the organization and its employees about their obligations to comply with the GDPR and other data protection laws. The DPO must have regular interaction with executive management, often via a Data Governor. Under the GDPR, they are subject to conflict-of-interest rules and enjoy a degree of protected employment status.
  • Data Governor: Typically an executive, answerable to the board, with broad knowledge of the issues but relying on the reports of Data Owners for details. Responsible for ensuring that enterprise-wide data initiatives and policies happen.
  • Data Owner: Responsible for the data that is used by the management domain, and generally a role undertaken by a senior manager. Must ensure that initiatives across the whole domain are fostered.
  • Data Stewards: These have a more technical role that is responsible for promoting an understanding of data management and making sure that all aspects of the use of data conform to the organization’s policies and standards. They ensure that data governance processes are followed and guidelines are enforced, and they recommend improvements to data governance processes. The Data Steward typically spends a lot of time improving processes for preventing and correcting issues with data to improve the quality of data that is provided for decision-making.
  • Data Administrator: Responsible for ensuring that all the organization’s policies and standards for data governance are assessed and met.
  • Data Custodian: Responsible, within a development or operational team, for making sure that applications and services all meet the organization’s standards and that the team understands the issues.
  • Data Processors: A legal entity doing data processing or archiving on behalf of a Data Controller.