NOT NULL Persisted Computed Columns (And What They Allow)

Ok, so I have been, and will mostly stay, in a bit of blog hibernation. I am super busy trying to finish up a rather large number of tasks (a few chapters in a book, 2 more SQL Saturdays, and as much theme park time as I can get in before the end of June, when I am having my knee replaced…) But as I was working today I discovered something that I wanted to throw together a blog on because I doubt most of my readers will know about this feature, and I wanted to try it out to the fullest. I had not seen this before in the course of writing database design/implementation books, coding many databases, etc for 20 years (ok, more than 20 years, since my first book on SQL Server Database Design was 19 years ago!)

This started out to be a quick, simple article, but it grew long to show all the interesting if possibly silly techniques that you can do with NOT NULL persisted computed columns.

I have for a long time known that persisted columns can be used in UNIQUE constraints and indexes. They never needed to be declare as persisted, because the engine would persist them for you when the index was added. I regularly use PERSISTED computed columns because they are important for performance, especially when used in reporting. What I had no idea of was that they could be used with other types of constraints. I am not at all suggesting at this point that you would want to employ every one of the techniques presented, or even that they are all good ideas (in fact, one of these ideas I will present is kind of horrifying from a proper design standpoint.) The way I see it, things you can do with code are like tools you find in that big toolbox you inherited. You play around with them, get a feel for how they work, but mostly have no clue what they are good for until you run into that need.

It is just good to know what “can” be done in code, and this is the point of my post today. Here are some tools you probably didn’t realize you had in your toolbox, and one day you may find a good reason to use them. (And ideally comment on my post with those ideas for the rest of us.)

Note that the examples in this article (other than simple computed columns) only work with persisted computed columns, if you use a non-persisted type, you will receive the following error:

In the following block of code, I will create a table to use for the first set of examples.

Now say we want to make a computed column on the Value column, that upper cases the value. We might add:

If you query the catalog, you can see that the column allows NULL values, which seems obvious since the source column allows NULL values.

This returns:

The source column allows NULL values, so the computed does too. Let’s add another ValueUpperFormatted, that is definitely NOT NULL because we will base it on a COALESCEd value.

Even though this clearly could not allow NULL values, query the metadata, and you will see it still allows NULL values.

We can fix this by declaring that the column is NOT NULL in the creation of the computed column (this was the piece of information I did not realize before today, and I learned it because I noticed it in a script that RedGate’s SQL Compare created.):

Check the metadata again, and you will see that ValueUpperFormatted now will not allow NULL values. 

Which returns:

Let’s clear out the data in the table:

And now, what if we define another computed column that clearly could be have a NULL result, but define it as NOT NULL;

This works, and the metadata indeed says the column does not allow NULL values, so if you try to add data to the table that has a NULL value:

You get what would be a bit of a head-scratcher (I am personally glad this is not how I learned of this feature, or it would have driven me a little nuts):

On the other hand, making a computed column not allow NULL values is a fantastic tool for the database designer to know that the expression they have crafted does not match that they expected it to never output a NULL value (and in my opinion, it is better to catch these things BEFORE 2.5 million rows have been inserted that don’t meet the requirements you have set up.)

Next, let’s add a check constraint our computed column. For this example, we are just going to make sure that the value in the table is a palindrome (because this is something that every data architect has come across at least one in their life, right?). So Value = REVERSE(Value);

We don’t need to check for NULL values, if the Boolean expression is TRUE or NULL, the operation will succeed.

This throws the following error:

But the following code:

Succeeds. Using the UPPER value, lets this work with case sensitive data as well, so that might be useful. In reality, if you put anything semi-complex in the computed column, based on multiple other columns, it might be good to make sure that the output is in some allowed range. For example, say you had a complex CASE expression, you could have NULL as the default, and make the column fail on a NULL value, but that could be more confusing than using a CHECK constraint with at least a descriptive name to point out the invalid value.

—————-

Speaking of confusing, for fun, there are two other types of things we can do with a persisted computed column that are pretty interesting, if perhaps at least somewhat impractical.This is using one as a PRIMARY KEY or a FOREIGN KEY.

In the following table, we will let the user enter an integer value for the base value of the key, then use a persisted computed column to format the value and use this as the PRIMARY KEY value. The datatype of a computed column can be interesting based on whatever the source data is, so we need to make the size exactly defined for when we associate the foreign key, hence the cast to nchar(6)):

You can see that the data is entered and is formatted:

Finally, let’s do a really sort of scary example of a way to implement a FOREIGN KEY constraint on a column that contains values that could be for multiple tables. WARNING: This example feels completely useless as a “real world” technique to me. Using the toolbox analogy, this is akin to that proprietary wrench you got in a box of cereal when you were 10 and left it in the toolbox. Probably useless, but a tool, nonetheless.

We will create another domain table for another type:

So now, someone has mixed Type1 and Type2 data in a column, and you cannot change that about your data, but you want to validate it (and doing this in a constraint is usually better, plus, in the final output you will have the “proper” versions of the columns in your table as well.

This returns the following:

Since we know that the algorithm to tell the source values apart is based on a prefix (it might be another column, or maybe a more clever parsing system than 1 for Type1, also). We can add:

Again, note: this is not a good design. Seriously. If you are making another human being do this because of how you implemented a databases, it is not a good thing.  Next, add FOREIGN KEY constraints to the two columns:

Now, through a bit of a roundabout manner, we are able to validate that the data in the TypeId column is either that of a row in Type1 or Type2 (additional work would be required to make sure that the value in TypeId follows the expected formats, for example, adding a check constraint that makes sure either Type1Id or Type2Id are not both NULL.

Now you can test it out by trying to put in invalid data, such as a value prefixed with a 3.

This gives you an error in the check constraint:

If the data is formatted correctly with a prefix of 1:

This gives you an error with the foreign key constraint. Prefix the invalid value with a 2, and you will see the error with the other FOREIGN KEY constraint. Put in a proper value:

Success. (Well, it works, you still are stuck with some really weird code to deal with!)