Planning for a Successful Database Lifecycle

Although it is well-known that the best efforts of a development team can be derailed by mistakes in the architecture, design and general governance of a development project, few attempts have been made to describe what needs to be done to increase the chances of success in the development of a database application. William Brewer steps into the breach to itemise what a delivery team needs to succeed.

Many of the problems that are faced in the course of developing database-driven applications have nothing to do with the delivery team at all, but are the result of poor preparatory planning work. The delivery process relies on other prior IT activities, and especially on ongoing collaboration with the governance team. Some dramatic development failures have, for example, been entirely due to a failure by the IT Governance team to understand the organization(s) that are destined to use the application; sometimes even when the intended users are within their own organization.

It is sometimes hard for the developers within a delivery team to accept that they rely on anyone else for the success of their work, but in fact a successful application depends on close teamwork between the three key activities: Governance, Delivery and Operations. The opportunities for failure are widespread, and often the source of failure is so subtle that the developers, by default, have the odium of failure unfairly heaped on them (see OASIG 1996, and Software Development Failures: Anatomy of Abandoned Projects by Ewusi-Mensah). If the necessary groundwork isn’t done first, no architecture can save the subsequent building.

This article aims to describe the work that needs to be done, what are often called the ‘required steps’, to ensure the best possible chance of success for database projects throughout their life.

Why the need to describe the required steps?

Although there was a great deal of agonizing in the 1980s about the difficulty of developing database-driven applications (see Perrow, C., 1984, Normal Accidents: Living with High-Risk Technology, Basic Books, New York), things came to a head in the 1990s, when a general skills shortage within the industry caused damaging failures and over-runs in software projects. At the lowest point of this ‘software crisis’, fewer than one in five software projects was successful.

‘… every component of good engineering practice had been ignored, every guideline of software engineering disregarded, basic management principles neglected, even the dictates of common sense overlooked.’

Finkelstein & Dowell, A Comedy of Errors: the London Ambulance Service Case Study, 1996

The Standish Group found, in a 1995 survey of U.S. companies and government agencies, that a third of corporate IT projects were abandoned before completion, at a cost of $81 billion, and it reported similar figures for 1998. A common theme was the chaotic state of software development. Out of the subsequent heart-searching, both in the UK and the US, came several initiatives, including Agile, the Information Systems Lifecycle (ISLC), the Software Development Lifecycle (SDLC) and Application Lifecycle Management (ALM). Specifically for databases there was the Database System Development Lifecycle (DSDLC) and, closer to ALM, Database Lifecycle Management (DLM).

Because these separate initiatives aimed to fix the same general problem, there was a lot of overlap. What made ALM and DLM attractive was that they set out the development and delivery process within its wider context. This made it possible to identify many of the management failures that were responsible for making the work of delivery more difficult, and the delivery shortcomings that made a database-driven application more difficult to manage subsequently in production. It could also identify issues in operations that could cause a perfectly good development to hit problems in production.

ALM and DLM were based in turn on Product Lifecycle Management (PLM), which had revolutionized product development in manufacturing industries in the late 1980s. These techniques allowed different teams to see in far more detail what was happening in a project, and allowed better cross-team discussion and participation. They also spelled out some of the processes that required cooperation between teams to make them happen.

DLM and Process Improvement

DLM aims to provide the optimum environment to allow a range of development methodologies to thrive, such as Agile and XP. It values experimentation and innovation, and aims to provide faster feedback of results. It does not prescribe a particular technique. It merely aims to make the whole lifecycle of the database repeatable, controlled, predictable, defined, measurable, visible and optimized.

It aims to build a culture that values process-improvement. It does this by means of more effective planning, automation, collaboration, cross-functional teams, an integrated toolchain, and an output that is tied closely to the aims of the organization. 

Where possible, it encourages measurement to give rapid feedback about process-change. DLM is rather different from ALM because databases, by handling business data directly within a variety of constraints and conditions, tend to be nearer the core of the organization that uses them, and they require more of the groundwork to have been completed before delivery can begin to ‘cut code’.

No organization can make the cultural switch to DLM rapidly. Most profoundly, because DLM acknowledges the extent of the contribution of governance to successful delivery, it makes the governance process accountable for its ‘deliverables’ in the same way as Delivery and Operations. The cultural switch aims to move the participating teams from the chaotic or unstandardized, through the defined, and thence via the measured to the optimized. By homing in on a particular aspect of delivery or operation, such as issue-tracking or the delivery pipeline, and assessing its current maturity, it is easier to identify the next stage to be reached, with the aim of continuous process-improvement. This process can be formalized with the help of a Capability Maturity Model (CMM), which provides a framework that helps teams to focus on the areas where process improvements will give the most gains in effectiveness.

The ‘Deliverables’ of Governance at the commencement of a database project

IT governance is a cross-project activity. It is responsible for the broad perspective of IT strategy, rather than the requirements of a particular application or project. This sounds rather nebulous, but it means that governance will be better placed to provide the context and history within IT, as well as an understanding of the organization. When a database needs to be delivered, certain preliminaries need to be decided or determined before any effective work can be done, and Governance should be well placed to tackle this. Without these preliminaries in place, the predicament of delivery is like that of a bricklayer who discovers that not only are there no architectural plans, but also no idea of where the services are, or how the site can be accessed.

Before delivery can commence, the Governance team needs to have determined the context of the application. It should have identified the main business needs and provided an overall data architecture. It needs to have decided how these business requirements can be met with a mix of bought-in, commissioned and internal development work. There must be a clear idea of the business priorities for the database.

If there is to be in-house development, governance must use its planning role to assist development teams with up-front planning. Governance must identify all enterprise-wide data and processes shared by applications, and plan data interfaces and feeds. It must ensure the participation of all the necessary IT expertise within the organization for security, compliance and so on. Governance must also agree a resilience plan with the Operations activity, in conformance with the standards defined by the organization. In a DLM-based organization, the ‘deliverables’ of the governance process are often tested by the delivery test team for consistency and completeness.

Let’s itemize a few of these tasks. For a small project, each is little more than a tick in a box, but within a corporate setting it may prove a major hurdle. Whatever the size of the development, though, these tasks still exist.

Identifying Off-the-Shelf Alternatives

A wise organization never develops a database-driven application if there is an adequate and cost-effective alternative available. You wouldn’t want to reinvent and maintain your own payroll system for your company, would you? This may seem obvious, yet many an organization has fallen foul of this error. There may be good reasons for creating a system whose functions are duplicated by a commercial system, but this is unlikely; if the reasons exist, they have to be carefully articulated and documented. Commercial organizations all share tasks that are uncannily similar, and it is far better to focus IT effort on the specialized parts of the organization. At the same time, even if a database is to be created, the actual RDBMS or NoSQL product to be used needs to be selected in the light of the requirements of the application, the cost of hosting, maintenance and support, and whether it is likely to be on-premise or cloud-based. This will be part of the technical architecture that is developed in collaboration with the delivery team.

Data Retention Requirements

Personal data, data about individuals, can never be held indefinitely, because of data protection legislation. Depending on what is held in a database, there can be several requirements, which may even conflict. I once had to deliver an application where the national government required data to be retained for five years in case it was needed as evidence in law. At the same time, European law required that it be removed after a year, and several legislative frameworks specified a two-year retention. After some head-scratching, we decided to move data out of the database after a year into a ring-fenced archive that could be accessed only with a written directive from IT management.

This activity is slightly detached from compliance, because by no means all decisions about retention have a basis in the legislative framework; some are just as likely to be business decisions or IT policy.
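
By way of illustration only, the following PowerShell sketch shows the sort of scheduled job that might implement such a retention decision. It assumes the Invoke-Sqlcmd cmdlet from the SqlServer module, and the instance, database and table names (SQL01, Sales, SalesArchive, dbo.Orders) are entirely hypothetical; a real job would also need error handling, scheduling and audit logging.

# Retention job (sketch): copy rows older than one year into a ring-fenced archive
# database with restricted permissions, then remove them from the live database.
# All names are hypothetical.
Import-Module SqlServer          # provides Invoke-Sqlcmd

$retentionSql = @'
DECLARE @cutoff datetime2 = DATEADD(year, -1, SYSDATETIME());
BEGIN TRANSACTION;
INSERT INTO SalesArchive.dbo.Orders      -- the ring-fenced archive database
SELECT * FROM dbo.Orders WHERE OrderDate < @cutoff;
DELETE FROM dbo.Orders WHERE OrderDate < @cutoff;
COMMIT TRANSACTION;
'@

Invoke-Sqlcmd -ServerInstance 'SQL01' -Database 'Sales' -Query $retentionSql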

Data Classification for Access Requirements

In most organizations, there are rules to ensure data integrity and security that determine which members of the organization, in which role, can access, update or delete various classes of data. This process of classifying data was carried over from manual systems, but it is still important, and each class of data will have its organization-wide business rules. Organizations generally adopt a legislative framework that ensures that the way they manage data integrity complies with current legislation and industry practices. Data retention practices, data access, security, auditing and resilience all have to meet a legal standard. To make this simpler to manage, the data is classified into groups according to the restrictions put upon it. The most obvious types of data that require special access control and security regimes are personal data, financial data and corporate data. Any data that is used by the application should be checked against this classification to ensure that it is stored and handled appropriately. Not even a startup with a simple data model is immune from this.
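
A data classification need not be elaborate to be useful. The following PowerShell sketch, with entirely hypothetical classes, handling rules and column inventory, shows the kind of check a delivery team might run to confirm that every column it stores has been assigned a class with known rules.

# Sketch: a simple classification map, and a check that every column in a
# (hypothetical) data inventory has been assigned a valid class.
$classificationRules = @{
    'Personal'  = @{ Encrypt = $true;  RetentionYears = 1;  Roles = @('HR', 'DataController') }
    'Financial' = @{ Encrypt = $true;  RetentionYears = 7;  Roles = @('Finance', 'Audit') }
    'Corporate' = @{ Encrypt = $false; RetentionYears = 10; Roles = @('AllStaff') }
}

# In practice the inventory would come from the data architecture, not be hard-coded.
$columnInventory = @(
    @{ Table = 'dbo.Customer'; Column = 'DateOfBirth'; Class = 'Personal'  },
    @{ Table = 'dbo.Invoice';  Column = 'Amount';      Class = 'Financial' },
    @{ Table = 'dbo.Depot';    Column = 'Region';      Class = $null       }   # not yet classified
)

foreach ($col in $columnInventory) {
    if (-not $col.Class -or -not $classificationRules.ContainsKey($col.Class)) {
        Write-Warning "$($col.Table).$($col.Column) has no valid data classification"
    }
}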

Master Data Identification – The Canonical Source of Data

All data is ‘owned’, and much of the data within an application will be pre-existing, ‘owned’ by another agency: in other words, it comes from an external source beyond the application. Even the meanest database is likely to have ‘static’ data, such as a list of languages, that comes from a published source and is maintained by its owner. You are unlikely to want to maintain a list like this yourself: you rely on the ‘owner’. As database applications get more complex, subtle problems can emerge if you are not clear about where the master version of all the different data that is copied into a database is held. This data can only be effectively altered or deleted in the ‘master’ or canonical source (also known as the ‘publisher’). Any Extract-Transform-Load (ETL) process can become extraordinarily arcane if this is not clear and well-documented.

Service-level requirements

Any organization will have a clear idea about how long it can afford to be without its IT services. Untoward disasters are a different topic from predictable downtime, and the service level defines the amount of software and hardware effort that should go into designing systems with minimal downtime. The architecture of an application will be quite different if it needs to be highly resilient, meaning that it is designed to avoid any possibility of downtime. Systems for the aviation industry, where failure would lead to loss of life, will be very different in nature from a software game, for example, and the test regime within delivery will be far more complex. The service level must therefore be determined before delivery can begin, since resilience cannot be retro-fitted into an existing system.

Compliance with Legislative Frameworks

Developers seldom wish to know the details of, or reasons for, regulatory compliance. The governance process must therefore translate the appropriate frameworks into a simple ‘cookbook’ in advance of delivery. Every company should have, and adopt, a compliance framework that is detailed enough to allow the delivery team to operate a series of checklists to make sure that all the work is compliant and that no redesign is ever necessary. A database that clearly defines the scope for sensitive data within an application is very different in design from one where data is not subject to legislation. Few legal requirements demand explicit documentation, but compliance is a business issue that has to take account of trade practices, union agreements, national law, international law, insurance conditions, and a variety of other factors. All this has to be distilled into simple ‘framework’ instructions to make compliance easy and effective.

Determining data consumers and providers

Databases are a great way of making data accessible across applications, but there is a limit to the extent to which any type of database can become a generic data server; that is the more specialized role of a data warehouse. More commonly, a database will publish data which is then used to update subscribers. ETL processes are generally used for this purpose, although in a homogeneous SQL Server environment, replication can be used. It is easier for the delivery teams, who have to build such a system, and operations, who have to maintain it, if this can be planned up-front without the need for many ad-hoc additions. Where data is served via OData, a REST API or another HTTP-based protocol, the resilience, maintenance and operational considerations need careful planning. A retrospective request close to project completion is seldom welcomed by the delivery team.

Disaster-recovery planning

This is a topic where Governance, Delivery and Operations must work closely together. Disaster recovery is related to service level, and concerns the speed at which a service can be restored after an unexpected disaster, the so-called ‘Act of God’. Whether floods, earthquakes, lightning strikes, man-made disasters or acts of war can be ascribed to an angry deity or not, it is likely that the angels are on the side of the IT people who plan and implement carefully thought-out disaster recovery plans. Any organization must have a Business Continuity Plan (BCP), which is much broader in scope than the IT applications and data. Such plans deal with key personnel, facilities, crisis communication, and reputation management. The IT Disaster Recovery Plan (DRP) is usually held by senior management as part of the BCP, and part of this is the Data Recovery Plan. Individual applications and databases will have different requirements, depending on the nature of the data and the Service Level Agreement (SLA), and the applications must be designed in line with the data recovery plan. Even small, apparently minor databases must have an effective disaster recovery plan; a failure to have one can, for example, invalidate an insurance claim.

Data architecture

All organizations must have, and maintain, a ‘data model’ that provides a bird’s-eye view of the data, data-flow, and processes within the organization. In the early days of commercial IT, this was often referred to as the ‘computer manual’. The maintenance of such a model is an implicit part of any IT department’s mission. Without it, survival of the organization would be difficult and, if legal problems ensue, ignorance of organizational data is no excuse in law. It has become a mistaken dogma that a ‘business domain’ has its own hermetically sealed edges, arcane rules and language, but in reality every organization knows, in broad terms, the nature, location and vocabulary of its data, and it will be aware of the processes that are applied to that data. It is certainly possible that the enthusiastic stakeholders for a particular application will have little idea, but that is another matter. Governance must tackle the task of determining exactly how any new application fits into this overall data architecture. By doing so, Governance can provide the delivery team with a very reasonable idea of the scope of the data, sufficient to refine the cost-estimate for the project.

Preliminary scope

It would seem obvious that a delivery team must be aware of the scope of the project that they are engaged with. It is a source of some bewilderment to the seasoned developer to discover a project being started before its scope has been specified, agreed and signed-off, but it happens.  The definition of scope does not have to be precise at this stage but it must be sufficient to allow cost-estimates to be reasonably accurate and timescales to be close to reality.

Up-front planning and costing

With the input of the data architecture and the preliminary scope, together with the service-level requirements and the data retention plan, it is easier to calculate a reasonable estimate of costs and timescales. These estimates will change according to the size of the teams involved, particularly developers and testers. This planning and costing exercise is most easily presented as a standard project-management model that can produce estimates based on the staffing levels of the project, the dependencies, and slippage in the completion date of any of the deliverables on which the development team depends, as itemized already in this article.

The ‘Deliverables’ of Operations at the commencement of a database project

Although some of the more traditional IT managers will recoil in horror at the idea of operations having a voice in project initiation, the truth is that a DevOps culture of collaboration and consideration, which stems from ALM best practice, has enormous value in ensuring the health of a database application throughout its active life. Databases and database-driven applications must be designed for production. Operational concerns must be factored into the design, and the experience of maintaining and supporting applications has enormous value in getting database applications right, and economical to run.

Aside from the DevOps initiative, Operations must work closely with Governance to make sure that the hosted production database meets the DR plans and high-availability service-level agreements. Beyond this, operational teams are the obvious source of detailed plans for hardware platforms, along with server and network configurations. At the same time, operations will need to work with both the delivery team and governance to prepare plans for hosting the pilot version of the database. Although governance is responsible for providing the ‘deliverables’ documentation, which I’ve itemized, to the delivery team(s), it will do so with advice and input from operations on the requirements for maintenance, monitoring, support and training. At this stage, it will not be in great detail, but sufficient to give accurate timescales and costs.

Operations will need to work together with governance to establish the likely demands for scaling and to evaluate alternatives for achieving it. They will also need to establish a Configuration Management System (CMS) archive for the project, based on a Version Control System (VCS).

Training and User Documentation Requirements

For any application that requires staff or members of an organization to work with it, there will be a training requirement of some sort. Training materials and courses need to be developed and delivered in good time. This is usually given to operations as a responsibility, and it has to be planned and costed. It can, in some circumstances, turn out to be a considerable expense, on the same scale of cost as the actual development. At this stage, the plans for the delivery of training need to be prepared in sufficient detail to understand the effort and timescales involved. Because of the level of detail required for training materials, it is unlikely that this work can commence long before release, but if the technical authors work closely with the User Experience (UX) teams and developers, a great deal of time can be saved.

Support Requirements

The users of a system are likely to require a measure of support, usually from a helpdesk that already provides support for a number of other systems. Most applications spend the majority of their life without any support from the delivery teams; support is provided instead by Operational Support/Helpdesk staff. Their requirements for this task are seldom met adequately, even though they are easy to provide if they are known early on in delivery. With good liaison between Development and Operations, the delivery team may choose to create a single monitoring console, for both development and long-term support, that can sort out all common user problems.

Support is ‘tiered’ to distinguish first-level support from support that must be escalated. This escalation procedure needs to be worked out in advance of the system going live.

Maintenance requirements

A database can spend most of its operational life without the close attention of a delivery team, and will rely on maintenance by operational staff. For a system that is highly available, this means that as little as possible should be assumed of the technical knowledge of the operational staff fixing a problem. As many of the common maintenance tasks as possible must therefore be automated, and all of them will need to be provided as a step-by-step guide. Maintenance teams are also often provided with a ‘manual’ of all the logged warnings, errors and exceptions, along with an explanation of the likely cause and its amelioration (for example, a warning of inadequate disk space, together with the necessary fix).
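
As a small illustration of the sort of maintenance task that can be automated, the following PowerShell sketch checks free disk space on a Windows host and points the operator at the documented fix; the 20 GB threshold and the wording of the guide reference are arbitrary.

# Sketch of an automated maintenance check: warn when free disk space falls below
# a threshold, and point the operator at the documented step-by-step fix.
$thresholdGB = 20   # arbitrary threshold, for illustration

Get-CimInstance -ClassName Win32_LogicalDisk -Filter "DriveType = 3" |
    ForEach-Object {
        $freeGB = [math]::Round($_.FreeSpace / 1GB, 1)
        if ($freeGB -lt $thresholdGB) {
            # In production this would raise an alert or log an issue rather than just warn.
            Write-Warning ("Drive {0} has only {1} GB free. See maintenance guide, section 'Low disk space'." -f $_.DeviceID, $freeGB)
        }
    }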

Escalation procedures

Both support and maintenance activities require documented escalation procedures for service requests, incidents and logged issues. These need to define which role must be informed of events, based on severity (e.g. production critical or production down, as compared with a usage issue that has a work-around), and which role should deal with different classes of incident or support request. They also need to define, up-front, the chain of alerts that must happen if nobody in a particular role (such as the support Database Administrator, or DBA) has responded. They should define acceptable response times, and all service procedures.
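
One way of making such a procedure unambiguous is to hold the escalation matrix as data that monitoring and support scripts can consult. The following PowerShell sketch is purely illustrative; the severities, roles and response times are invented.

# Sketch of an escalation matrix held as data, so that support and maintenance scripts
# can consult the same definitions. All roles, timings and severities are illustrative.
$escalationMatrix = @(
    @{ Severity = 'Production down';     FirstResponder = 'On-call DBA'; RespondWithinMinutes = 15;  EscalateTo = 'Operations manager' },
    @{ Severity = 'Production critical'; FirstResponder = 'On-call DBA'; RespondWithinMinutes = 30;  EscalateTo = 'Operations manager' },
    @{ Severity = 'Usage issue, work-around available'; FirstResponder = 'Helpdesk'; RespondWithinMinutes = 240; EscalateTo = 'Second-line support' }
)

# A monitoring script can look up the row matching an incident's severity, start a timer,
# and alert the EscalateTo role if no response is logged within RespondWithinMinutes.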

Issue reporting and tracking

Automated issue-tracking systems are vital to production, and are extremely useful in delivery as well. If the issue-tracking mechanism is devised early in the life of the system, with all the requirements of operations, delivery and test accommodated, it is likely to save a great deal of time. The database system should be able to feed issues into the issue-tracking software, as well as to the participating teams. Such issues would include all database events that should result in an alert, such as over-running scheduled processes, SQL batch compile errors, long-running queries and deadlocks. This requires a great deal of care to prevent duplicate issues flooding the issue-tracking system.
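
As a sketch of the idea, the following PowerShell feeds a database alert into a hypothetical issue tracker over a REST API, with a simple check to avoid raising duplicates. The tracker URL, the shape of its API and the alert fields are all assumptions, although most trackers mentioned later in this article (Jira, Bugzilla, Mantis) expose something comparable.

# Sketch: raise a database alert as an issue, but only if no matching open issue exists.
$trackerUrl = 'https://tracker.example.com/api/issues'   # hypothetical issue-tracker API
$alert = @{
    source   = 'SQL01.Sales'
    category = 'Long-running query'
    summary  = 'Query exceeded 30 seconds on dbo.Orders'
}

# De-duplication: only raise a new issue if there is no open issue with the same summary.
$openIssues = Invoke-RestMethod -Uri "$($trackerUrl)?status=open" -Method Get
if (-not ($openIssues | Where-Object { $_.summary -eq $alert.summary })) {
    Invoke-RestMethod -Uri $trackerUrl -Method Post -ContentType 'application/json' -Body ($alert | ConvertTo-Json)
}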

What the delivery team must provide at the commencement of a database project

A delivery team of any size will have, or be able to call on, specialized expertise in User Experience (UX) and design, test, technical authoring, and technical architecture. Larger teams may also be able to call on expertise in configuration management. This expertise is likely to be shared across projects, and it takes skill to make sure that the right expertise is available at the right time.

Creating the delivery pipeline

Though it is a pleasant thought that developers should be able to lean back in their chairs and yell ‘Let’s ship it!’ the reality is a lot more complex. The deployment of a major application is like launching a ship, and the delivery pipeline is like the slipway.

Even a reasonably simple trading website needs a deployment pipeline, however rudimentary. At the other end of the scale, a database-driven application whose failure could cause loss of life, or financial disaster, requires many checks, release-gates, and division of responsibility. Whatever the scale, it is usually best to create and refine the process well before it is required. This will involve the solemn delivery of a ‘Hello World’ application through all test regimes, sign-offs, release-gates and approvals, through staging and on to production. Once the process is determined and agreed with all the participants, it must be automated, to make the process visible to all participants and to make it measurable. It must be flexible enough to allow improvements and amendments.

Delivery will take the lead in this, but in cooperation with operations and governance. All three activities will have an essential role in providing an effective delivery pipeline.
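
A sketch of such an automated pipeline, reduced to stubs, might look like the following PowerShell; the stage names are illustrative, and in practice each stage would call the team’s chosen build, test and deployment tooling, record its results, and enforce the release gates.

# Skeleton of a delivery pipeline, of the kind used to rehearse a 'Hello World' release
# end-to-end before real development starts. Each Action is a stub standing in for
# the real build, test or deployment step.
$stages = @(
    @{ Name = 'Build';                Action = { Write-Output 'compile and package the changes' } },
    @{ Name = 'Automated tests';      Action = { Write-Output 'run unit and integration tests' } },
    @{ Name = 'Release gate';         Action = { Write-Output 'check sign-offs and approvals' } },
    @{ Name = 'Deploy to staging';    Action = { Write-Output 'deploy the package to staging' } },
    @{ Name = 'Deploy to production'; Action = { Write-Output 'deploy the package to production' } }
)

foreach ($stage in $stages) {
    Write-Output "=== $($stage.Name) ==="
    try {
        & $stage.Action            # invoke the stage
    }
    catch {
        # Stop the pipeline, visibly, at the first failing stage.
        Write-Error "Pipeline stopped at stage '$($stage.Name)': $_"
        break
    }
}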

Refine the technical architecture

Any initial technical architecture will be relatively short on detail but should, if done properly, provide a good framework for discussion and elaboration, with more of an accent on the requirements than on the technical solutions. It is likely that the technical architecture will develop in many ways with the input of delivery and operations. The test team within delivery will also have an input into the technical architecture, particularly where they are closely concerned with soak testing, performance testing, limit-testing or scalability. Some test teams are extremely skilled in spotting where information is missing or where there is a logical inconsistency in the specification.

Selecting a toolset

The choice of toolset is nowadays more open, and less inevitable, than before. ALM as a development initiative was wrecked by the software publishers who convinced the market that it needed a single, unified toolset across the whole application lifecycle, from one provider. It became a futile race for the most comprehensive Integrated Development Environment (IDE), because developers have a cultural distaste for being ‘boxed in’ by a single supplier. The rise of an ecosystem of tools that are best-in-class but able to communicate with each other has since changed the way that developers work. Instead of staying within a single IDE, many teams will now happily mix such diverse tools as Atom, Jira, Bugzilla, TeamCity, Git, DokuWiki, Slack, Gitter, SSMS, SQL Compare and Mantis, as well as Visual Studio, IntelliJ IDEA or Eclipse. There is much more freedom in the choice of tools to assist in the database lifecycle.

Although there is less of a need to decide on a final selection of tools before development starts, it is useful to have a reasoned plan in place, and to reserve time to check the market for niche development tools in the light of experience with the project.

With DLM, the big difference is that the toolset must accommodate the requirements of all the participating teams, rather than just the developers within the delivery team. There are obvious advantages, for example, if the same issue-tracking tool can be used for both development and operations, and if the scripts for maintenance jobs that will be used in production use the same tools as development. In the delivery pipeline there is a particular need for effective communication between participants, and this will require special care.

Provide the roadmap for iterative development

DLM aims to be effective with any development methodology, but many of its aims are more easily achieved if development proceeds step-wise, with clear objectives for each step. Delivery is achieved iteratively, in a series of increments or steps, each of which delivers new functionality. This provides clear progress towards the system’s final functionality. These iterations should be short and, if they are of regular size, will help everyone to understand the progress of the delivery team and, hopefully, successively refine the estimates of the remaining work. Each iteration should adhere to a similar pattern and should be able to demonstrate objective progress, but should not be extended merely for this purpose.

At the start of development, the team needs to be clear on the processes, systems and vocabulary that underlie an iterative development.

Generally speaking, an iterative development will proceed in a regular way that allows for continuous process improvement, feedback and monitoring of progress.

Each iteration is fleshed out at the start by means of a planning meeting between the delivery team and governance. The objectives and tasks for the iteration need to be planned by all participants, to ensure that there is agreement on what is expected in terms of delivery at the end of the iteration, and why. Governance will be particularly valuable in helping to assess the repercussions of any new work item on the current iteration. If new work cannot be delayed until the next iteration, the priority of the added scope has to be negotiated carefully by all parties, so as to accommodate it against existing work items.

This meeting needs to define the objectives in terms of work items and their component tasks. The priority of individual tasks should be determined. Each work item needs to be carried through to completion, right through to potential delivery, including all unit testing. Individual team members will sign up for tasks and estimate how long they will take. Once this has been agreed, the iteration should be allowed to proceed with as few changes to this plan as possible.

During each iteration, the delivery team needs to share details of progress on active tasks, and address any slippage promptly. The team will need to gauge progress and detect issues, in order to cooperate on a solution in a way that encourages team members to actively seek advice, specialist expertise and help. Although regular meetings are useful for this where they save time, they can be unwieldy in a DLM/ALM setting, because governance and operations also need to be able to see progress and status via an iteration plan that logs progress.

At the end of the iteration, any incomplete work items are returned to the work-item list, and usually included in the subsequent iteration. The team will need to assess whether the requirements that have been flagged as complete have passed their test cases. Each iteration will probably include a review and a retrospective, so as to check for opportunities to improve the process and to flag up any issues that could affect delivery.

Iterations should result in the delivery of fully-tested software that can be demonstrated to the users and stakeholders, and possibly released.

Plan out the detail of all supporting ETL and reporting systems and services

The development work on database systems can quickly become gridlocked unless dependencies between applications and their databases are reduced to the minimum required, and contained within an interface. Although, in the popular imagination, there is an idea of a simple one-to-one relationship between a database and its application, which can be kept in step by keeping them together in a single source-control system, this model breaks down rapidly with increasing complexity. It is safer to first identify all the likely ‘customer’ and ‘provider’ applications, and then to negotiate with their delivery teams a specified interface that allows each system to evolve independently without unintentionally introducing dependencies. This work must be done at the start of delivery. If a single team is designing a new database to serve a user-application, then only one interface is required.

There are, however, likely to be reporting and warehousing systems downstream of the database application that will need to be served data. The way in which this is done is a matter for discussion between teams, but it should never be left to chance, or to an ETL system that grows organically like barnacles.

Identifying existing applications that could become customers or providers of data should be easy with a corporate data architecture, and particularly easy if the governance process has already determined the data consumers and providers (see above).
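
As an illustration of what ‘contained within an interface’ can mean for a downstream consumer, the following sketch publishes a narrow, versioned reporting interface as a separate schema, so that the base tables can be refactored without breaking the consumer. It uses Invoke-Sqlcmd from the SqlServer module, and the schema, view, table and role names are all hypothetical.

# Sketch: publish a versioned interface schema for a reporting consumer.
Import-Module SqlServer      # provides Invoke-Sqlcmd

$interfaceSql = @'
CREATE SCHEMA Reporting AUTHORIZATION dbo;
GO
CREATE VIEW Reporting.CustomerOrders_v1
AS
SELECT c.CustomerID, c.CustomerName, o.OrderDate, o.TotalValue
FROM   dbo.Customer AS c
JOIN   dbo.Orders   AS o ON o.CustomerID = c.CustomerID;
GO
-- The consuming application's role (assumed to exist already) sees only the Reporting
-- schema, never the base tables, so the tables can change behind the view.
GRANT SELECT ON SCHEMA::Reporting TO ReportingAppRole;
'@

Invoke-Sqlcmd -ServerInstance 'SQL01' -Database 'Sales' -Query $interfaceSql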

Ensure that development is a managed process

DLM bases much of its promise on improving delivery processes by making them more reliable, repeatable and consistent. In general, development of a database application evolves along a spectrum that ranges from the chaotic through to the optimized and, to move along it, delivery must become more standardized, clearly defined and measured.

To do this in more detail, it helps in the planning to take a realistic view of all aspects of delivery, assess where the team is on this spectrum, and identify what steps are required to refine the process to the next level. There are a number of techniques that assist with this, but they are beyond the scope of this article.

Determine all database requirements to comply with the legislative framework

If all has gone well, a delivery team will be furnished with a relatively simple list of what is required in order to comply with the regulatory framework that the organization has adopted. The next step is to determine what data needs to be held in the database; for this, the data architecture is essential. Once this is done, it should be possible to be sure of the data retention requirements (see above) and the data classification for access requirements (see above). This will then allow the delivery team to map the data according to its access level and ensure that the database meets the organization’s compliance framework.

How is this organized?

Who provides the governance to Governance?

Although project-based deliverables can be tracked by a project manager, there is often a problem in IT departments with deliverables, such as business continuity plans or data classifications, that are shared across many different activities. If, for example, there isn’t an adequate framework for regulatory compliance, or the organization’s data model is insufficient, it is difficult to correct the problem, and a project can face over-runs as a consequence. Although it is a beautiful thought that one can have highly organized and process-conscious delivery teams within a department that is essentially chaotic, it simply doesn’t work in practice.

A well-functioning test function can bring its disciplines to bear on documentation, models and frameworks just as easily as on software. Where test teams have been given the task of checking whether these ‘deliverables’ are actually fit for purpose, experience has shown that it can be an effective way of ensuring adequate quality in what is provided for delivery.

What happens when information is missing?

So often, when a member of a delivery team needs to find out about the technical architecture, data model or production requirements, the person who is responsible will tap their head and say, ‘Don’t worry, it is all there in my head. Just come to me and ask if there is anything that isn’t clear.’ In fact, if something isn’t documented, it doesn’t exist, and the vague contents of someone’s head are lost when that member of the organization leaves or is moved to another department. A ‘model’, whether of data, architecture, support or any other organizational process, will consist of a written description, overviews, roadmaps, Unified Modeling Language (UML), entity-relationship (ER), Structured Systems Analysis & Design Method (SSADM) or other standard diagrams, and catalogues and indexes written to a fairly standard format. Many will protest that such a document is never read. In fact, if it is useful, correct and maintained, then it will be used; the objective is, after all, communication. The alternative is excessive duplication of effort, a breeding-ground for misunderstanding, and endless re-engineering of code to correct business logic. In other words, chaos.

What are the practicalities of maintaining models, plans and documents?

DLM aims to improve delivery by refining processes and by improving the expertise of the teams. Some information is relevant to everyone, some only to a particular application or database, and some only to a team or activity. These all require different ways of maintaining a record.

Documentation must only be done for a purpose: Is it to allow software to be refactored or maintained? Does it define a contract model? Is it purely for audit purposes? Is it done to make support tasks easier to perform without calling on specialized expertise? Is it to define a technical solution for a series of applications, such as single-sign-on? Is it to define organizational policy? Is it a document that could be used in litigation? All these types of documentation require a very different approach. Documentation of software components and database schema, for example, is best held as close to the code as possible, whereas an organizational, technical or data model is likely to be an intranet-based document and a collection of diagrams.

Any document that is maintained by governance will have a purpose, a major part of which will be to make software delivery much easier. It is essential for preserving ‘organizational memory’ effectively, but it isn’t always the best way of coordinating aspects of a delivery, or of recording a particular system. It must stick to facts rather than speculation, and it is best developed by evolution, with regular revisions as understanding of the subject-matter is refined with feedback. It must be trusted by those who use it. It should be as concise as possible, and it should be regularly tested rather than relying on custom or flawed memory.

The task of maintaining an organizational model has to be costed so that the organization can be aware of the scale of investment. The creation and maintenance of the model must be explicitly agreed and signed-off by management.

All shared documentation that is used by an organization should be in version control, for the simple reason of being able to prove what was known at any particular time, in much the same way that official typed documentation was always date-stamped.

Wikis also offer a measure of audit, and specialized wikis can be a very effective way of maintaining documentation, particularly if they allow discussion and comments.

Documentation that is directly concerned with the code, build and configuration of a database application should be held in the same version control system as the code itself, as long as it has the same level of access rights as the code.

Opportunities and problems with automation

In a DLM project where the DevOps culture prevails, there is likely to be a great deal more scripting, using shell scripts, command batches or PowerShell. This will involve not only aspects of the application itself such as ETL, but also the configuration of networks, cloud assets and servers. It is likely to be intensively used to automate testing and deployment.

These scripts need the same disciplines as the source code: they must be in version control. However, it is not always appropriate for them to be lodged with the application source. At one end of the scale, access to the scripts must be restricted if they contain logins or credentials, or if access to sensitive information would be made easier for an unauthorized user who scans the scripts. It also makes it easier to regulate access to these materials if they are stored in role-related archives, such as a configuration management archive or a support archive.

Scripts are best signed, so that servers can be configured to run only scripts from an authorized source. PowerShell scripts, for example, have great potential power, and there are several safeguards to prevent PowerShell from becoming a security black-hole. Without a restrictive access policy and other security measures in place, automation cannot be implemented without security risks. It is best to plan in advance, to make the burden of security and access-control manageable.
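
For example, with a code-signing certificate already installed, a deployment script can be signed and the servers restricted to running signed scripts only; the certificate location, script name and execution-policy choice below are illustrative.

# Sign a deployment script with a code-signing certificate from the current user's store.
$cert = Get-ChildItem Cert:\CurrentUser\My -CodeSigningCert | Select-Object -First 1
Set-AuthenticodeSignature -FilePath '.\Deploy-Database.ps1' -Certificate $cert

# On the servers that should run only signed scripts (requires administrative rights):
# Set-ExecutionPolicy AllSigned -Scope LocalMachine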

Conclusions

To provide database-driven information systems to organizations of any size and complexity, delivery teams and operations need high-quality information, when they need it and where they need it. If the delivery of a database is to be effective, there must be careful planning and analysis, based on process modelling, analysis of options, and a good, broad understanding of the nature, source and constraints of the data being used.

Iterative development works best when the right information and assistance are provided for the designers, testers and developers of the delivery team, so continuous delivery of database functionality cannot come merely from the individual heroics of developers with mutant-like special intellectual powers. The magic can only happen with close teamwork that works in a repeatable way, continually refines its delivery processes, and organizes the work into discrete ‘deliverables’. Such a team can only work effectively if the broad sweep of the business domain is well understood, so that nothing has to be redone because of a misunderstanding of the requirements. There must be no need for heroics, or pizzas in the night.