Managing database changes

Stored procedures are not the be-all and end-all of managing changes in a database, but they can be very important tools in allowing changes to be made to a database whilst not breaking client applications...

Keep your applications running!

In a recent article, I noted that I have had great success in using stored procedures to isolate client applications from database changes. Ensuing reader comments make it clear, however, that using or not using stored procedures to isolate applications from database changes is a hot topic. With that in mind, below are some real-life examples of changes made to a database that do not break client applications.

First rule of database maintenance: Do no harm

The first rule of maintaining a database that is actively used by one or more applications is to avoid breaking existing applications unless it is absolutely necessary.

COM programmers will no doubt be familiar with the concept of interfaces. In COM, an interface is designed as an immutable contract between a component and the consumer of the component. So, if a method named ReadEmployees() in an interface named IEmployee accepts a single integer parameter named DivisionID, the consumer can continue to rely on that method being available in any component that implements the IEmployee interface.

There may be times, however, when an application needs new functionality. In the IEmployee interface, employees may be in many locations, and location-level managers might need the ability to retrieve only employees for a given location. Adding a parameter named LocationID to ReadEmployee() might be useful to ensure that the component will never expose an employee for another division. For a COM component, adding a parameter breaks the interface, so you might create an IEmployee2 interface and provide both methods in that interface. You can then create components that implement both IEmployee and IEmployee2.

When dealing with SQL Server stored procedures, we are much more fortunate. SQL Server stored procedures accept parameters that have default values. This makes adding new parameters less traumatic, and allows new functionality to be handled in backward-compatible ways.

Here’s an example: I was recently asked to expand a system to include not just residents in the department of medicine, but pediatric residents as well. The database was ready for this sort of change, with mapping tables that allowed patients from multiple patient services to be included in a single master service. The database was ready to accept a new master service named PEDI that would include PEDI (pediatric), NEWI (newborn) and NICI (neonatal intensive care) patient services.

Looking at the stored procedure cone, there were a number of places in which MEDI was hard coded. This was done because the organization that originally funded the work did not wish to purchase a system beyond their needs at the time. Adding support for a master service internal to the stored procedures was not a problem, but adding external support was viewed as cost prohibitive.

So there were a number of stored procedures like this (greatly simplified):

Adding a parameter for the master service resulted in a stored procedure call like this:

This stored procedure worked fine for ASP.NET applications that use ADO.NET. (See the note below for classic ADO considerations.)

A somewhat more complex case was a stored procedure declared as follows:

While it is not entirely clear in the name of the parameters, @UniqueID was a GUID used to uniquely identify a user. The initial system that used this database tracked users employing the GUID as a form of indirection, if not security. The GUID was seen as an opaque ID, completely devoid of meaning, and an identifier that would likely not be used to guess some other user’s identifier (unlike a sequential-integer identifier). As often happens, a new system was going to use the database, and wanted to use an integer identifier (already in the database) rather than the GUID. I was able to modify the stored procedure to enable both applications to work well together:

Through use of IsNull(), I was then able to enable the stored procedure to determine whether @UniqueID or @UserID should be used. Using the SqlClient ADO.NET provider, each application could provide one or the other of the parameters and function correctly. Internally, the stored procedure validates that at least one of the last two parameters is non-null.

In some cases, there is no easy way to paper over differences between a stored procedure and the changes that are required. In cases like this, rather than break existing applications I will create a new version of a stored procedure, appending the original base name with a 2 or, if a 2 already exists, a 3, and so on. So “spSaveSomething” would become “spSaveSomething2” and so on if there is a newer version. I commonly find that stored procedures that are adding an OUTPUT parameter are great candidates for a new version.

When adding a column to a table in your database, a change to the stored procedure can generally be avoided by creating the new column with a reasonable default value. You may also discover that existing functionality can continue to work using the default values. Functionality for the new column will need to be added, but you can often leave existing functionality using the default value of the column or ignoring the column altogether.

While not required, I commonly specify which columns I want returned using a row-returning stored procedure. This means that adding a column to a table will not result in additional columns returned in a result set, as it would in a SELECT * scenario. In practice I do not write code that would break based upon an additional column, as might result using the classic ADO GetRows().

The number of parameters is not the only thing that can change an interface, of course. Sometimes, the meaning of a stored procedure, or more frequently, its side effects, will require that you create a new version of a stored procedure. A new version of a stored procedure might change, for example, from accepting a username and password in clear text, to accepting a username in clear text and an encrypted password. More important, a stored procedure might need to do additional tasks that an original application will not expect.

Classic ADO considerations

One of the major differences I noticed when moving to the SqlClient ADO.NET data provider is that parameters are named, rather than positional. In classic ADO, by default you need to add parameters in exactly the same order as they appear in the stored procedure create statement. You can provide names for the parameters, but they have no significance, except if you need to refer to the parameter object again in your client application, like when you retrieve the value from an output parameter.

As of ADO 2.6, you can set the NamedParameters property of the Parameters collection to true, and parameters will be named much like they are in the SqlClient ADO.NET parameters. I have never bothered with this, and I don’t recommend relying on a particular version of ADO for most applications. For server-based applications in which you have total control over the database and web servers, this might be a reasonable option.

The positional nature of parameters has implications for making modifications to stored procedures. Suppose in the example above I changed the signature of a stored procedure from:

to:

There are scenarios in which this change will cause problems. If you modify the classic ADO application that calls this stored procedure, for example, and if there is a code path that sends only the @MasterService parameter, there will be an error, because the string you send as the @MasterService will be interpreted as the @ResidentID parameter and will not likely be converted to an integer. No runtime error (though likely confusion) would result if both parameters were strings and you sent only what you intended as the @MasterService parameter, since it would be interpreted as the @ResidentID and might result in undesired activity.

When using OleDb to access a SQL Server, parameters are also positional. You can provide names for the parameters, but they are ignored. Parameters also need to be added left to right in the SQL String as they appear.

Changing databases, working apps

Stored procedures are not the be-all and end-all of managing changes in a database, but they can be very important tools. I commonly support two or more versions of a web application in a given database by providing backward-compatible stored procedures, or in a pinch, versioned stored procedures. By carefully enforcing the rule that, wherever possible, changes should not break existing applications, you can reliably enable your database to change and grow while existing applications continue working. And that is something I think everyone can agree is a good idea.