Scripting the Description of Database Tables Using Extended Properties

Stored procedures, for example, are very easy to document. The comment block at the beginning stays with the code and a CREATE or ALTER script contains everything to reproduce the proc. SQL Server tables, however, are more difficult to document. You can use Extended Properties to document columns and constraints, but working with Extended Properties is difficult at best. Phil Factor demonstrates ways to easily add Extended Properties to your build scripts.

You can make all the difference to the ease of the task of developing a database by using Extended Properties as documentation. Why aren’t they universally used? Because they are a pain to script and maintain. I’ve spent too long enduring the difficulties, and I’ve even published a few useful ways of managing Extended Properties, but a recent experience in managing the documentation of tables made me decide finally to try to do something about it. After some rather unsatisfactory but workable approaches, I hit on a JSON-based solution that suits me fine. It is possible to create a solution without JSON or a recent version of SQL Server, and I’ll show you how later on in the article, but it isn’t as neat.

Why Use Extended Properties for Documentation?

Once you’ve got into the habit of using Extended Properties to document your database, there are obvious benefits:

  • You can explain why you added that index or modified that constraint.
  • You can describe exactly what that rather obscure column does.
  • You can add a reasoned explanation to the use of a table.

You will often need these explanations because, sadly, DDL code isn’t ‘self-documenting’, and human memory is fallible. Extended Properties are easily searched because they are all exposed in one system view.

It is great to add explanations to lists of procedures, functions and views once the database becomes sizeable. Extended Properties are useful when exploring the metadata, but the requirement isn’t quite so essential because comments are preserved along with the source code. Tables, however, are a big problem because SQL Server throws away the script that produces the table, along with all the comments. The reason that this happens is that there are many ways you can alter parts of a table without scripting the entire table. How could one infallibly preserve all these ALTER statements in the preserved script? It’s tricky. Table scripts that you get from SSMS or script via SMO are therefore synthesised from the system tables but without those comments or even Extended Properties.

If you’ve added documentation, you get more information when you explore the metadata of a table and can quickly get up to speed with it. With the help of a simple inline table-valued user defined function, for example, you can get useful info. In AdventureWorks, just to demonstrate this, I can use this query (Download the functions and stored procedure required to follow along at the bottom of this article in the section marked ‘downloads’):

… and get this information.

It is still worth documenting database objects other than tables. If you just rely on embedded comments within procedures, functions and views, you wouldn’t get handy information listed like this. It also means you only have to document the database once and extract reports from it as necessary.

The Problem of Using Extended Properties for Documentation

I’ve experienced four problems for maintaining documentation:

  • Writing Extended Properties in the first place
  • Maintaining them over time
  • Preserving them in source control
  • Generating a build script for when you create the table from code, or when you refactor by dropping and replacing the table

I use SQL Doc for the first two tasks, of course, but the third and fourth must be done by scripting them out. Scripting all those Extended Properties can be a nightmare if you’ve been conscientious in creating them.

From these general difficulties, what are the essential tasks that are particularly tricky? In my experience, it is

  • Knowing what is and isn’t there in the documentation of your objects in the first place
  • Altering several existing Extended Properties, especially when you are doing it in bulk and something gets renamed
  • Adding several Extended Properties in a single operation
  • Scripting out Extended Properties, either for documentation or for source control

Updating an extended property in SQL for a migration script, for example, can be painful. You would get an error if you updated an extended property that didn’t exist or if you added one that already existed. To get around that, you’d first filter out all those properties that had blank strings and execute either an ADD or UPDATE depending on whether the property already existed. In this example, we update just the documentation for the LastName column on a new table, dbo.Person. (Script included with the article)

Scripting the Extended Properties

When you generate a SQL script for a table via SSMS or PowerShell, you get a horrible mess of cluttered code to create each extended property. For an entire database script, this can bulk up the file to over twice its size. To appreciate the mess, just script out AdventureWorks to see what I mean. It takes five hundred words just to script out the Extended Properties of a single table (I chose HumanResources.Employee to check this out.) That script is developed entirely for machines, not us. This type of script soon becomes impossible to use or maintain. It adds a lot of bulk to your table scripts. Adding an extended property via script is next to impossible and finding the extended property you wish to change is a trial. You will strain to determine where documentation is lacking.

Here are three of the Extended Properties, just to show you a flavor

It also means that documentation is part of the build process, embedded in the DDL for each database object, rather than managed as a separate chore. We can reduce this by a half by using a stored procedure to do the work.

The HumanResources.Employee table is rather daunting so, to demonstrate this, we create a simple table and then add all the Extended Properties. To try the script out, you must be on SQL Server 2016 or 2017 and will need to install the temporary stored procedure #AddOrUpdateTableDocumentation in the same session. The script for the procedure, and all other code I mention  is included with the article in the Downloads section at the base of the article.

To see how SMS would do the same this, just right-click on the table in the browser pane once you’ve run the code that created the table.

Yes, it is far more difficult to read. More to the point, it doesn’t actually help you to add the documentation. Also, the script that we use via the #AddOrUpdateTableDocumentation procedure can be used to update the documentation. If you have a newly-built table without any documentation and you want to add the code that attaches the Extended Properties to the table after it is built, then all the values for the Extended Properties would be blank. In our case with our example Person table, it would be….

You’d then want to fill in all those blank values for “doc”:”” as I did in the first listing. It doesn’t take long! If a doc field is left blank, it isn’t touched.

Did I write all that JSON? No of course not. I generated it with a function that builds whatever documentation it finds, in whatever tables you specify, and leaves any undocumented object with a blank value ready for you to add.

First, before I show you the function to do this, I need a utility function, ListOutTableEPs, that provides all the details of the columns, constraints and indexes for a table. You’ll see that there are certain subtleties: For example, you don’t want to script out documentation for indexes or constraints that are system-named, do you?

This function is only required for preparing a build, or in generating the script for updating an existing table. I’ve put the table-valued functions in DBO just to keep things simple for the demonstration. You’d obviously deal with them by having a utility schema or dropping them after use. If you cannot have utility iTVFs, even temporarily (nothing wrong with this as we are firmly in development here using development servers), it is possible to ‘inline’ it as a large batch from an application. I do this inside PowerShell as shown later in the article. Sadly, you can’t create a temporary iTVF as you can with a stored procedure or table.

Now, with this function in place, we can use it to create the JSON file, or data file for the procedure that adds the documentation for the table. When we first run it with a newly-created table, there will be no Extended Properties giving documentation, so that their values will be blank. If you have already added them via SSMS or SQL Doc, they will appear in the list.

Feed this function the name of a table, and you will get a list of all the columns, indexes and constraints (not the ones with system-generated names- we deliberately leave them out).

You can now get a list in JSON of all the tables, columns, constraints, and indexes for your database, along with your documentation with just this code which you can save and use as a parameter for the stored procedure that I’ve demonstrated earlier in the article.

The #AddOrUpdateTableDocumentation procedure is designed purely to create or update the JSON-based data. It is a bit more complicated than it would be if it was purely intended for use with a build script.

We’ll use a temporary stored procedure. If we are doing deployments, we don’t want to have anything that leaves an artefact in the database. In this version of the procedure, I’m passing in a table source in a JSON document. If you don’t have JSON, you’ll need to define a Table Valued Parameter type to do the same thing. This will leave an artefact which one wants to avoid in a build script, so it will need subsequent tidying up afterwards. A temporary procedure is very useful for builds and deployments

Now we can add or alter the documentation in the database simply by creating a JSON file that contains the data for any number of tables.

Generating Documentation Build Scripts by Automation.

Here is a PowerShell script that generates a script for every table, each in its own file without the Extended Properties. Then, it appends the output, so you end up with a tidier and more readable script. I have used an ‘inlined’ version of the dbo.ListOutTableEPs iTVF, so there are no dependencies to install before you do the scripting.

If You Are Using an Older Version of SQL Server or Don’t Like JSON

You can still add Extended Properties to a build script easily if you can’t use JSON, here is an approach that doesn’t even need a temporary procedure. The downside is that you can’t use it to edit an existing property, just for a build script.

To generate this build script from an existing version of the table with the documentation added to the Extended Properties, you can just use this procedure.

When you run this script, it will produce the build script that you can append to the output of your table build script to add the documentation as Extended Properties, in the same way we’ve just done in a PowerShell script for the JSON version.

Conclusions

The provision of Extended Properties for documentation is typical of the SQL Server features that time forgot, where the SQL Server team occasionally wax enthusiastic about a new feature but give the impression that their attention wanders like a Labrador puppy racing after a ball when you throw it another one. There is so much they could do to improve the feature but, in the meantime, it is up to us to make what there is usable. For me, the top priorities are that documentation for the components of a database must be …

  • Easy to put into source control
  • Easier to see what is there and what is missing
  • Easier to see what needs changing
  • Easy to associate them with the build script

I’d like to see Microsoft improve the feature so that the MS_Description column documentation is introduced into the table build process in SMO. It should be accessible via SSMS as inline comments after the column definition and the table documentation is added to the beginning as block comments. I’d also want the documentation for indexes to be added to the generated scripts of the index definition as comments. It isn’t hard to do and looks great.