Product articles
SQL Prompt
SQL Code Analysis
Problems with adding NOT NULL columns…

Problems with adding NOT NULL columns or making nullable columns NOT NULL (EI028)

Phil Factor explains the problems you might encounter when adding a non-nullable column to an existing table or altering a column that contains NULL values to be non-nullable. He demos a migration script that can deploy such changes safely. You might also learn that in an archaic form of the Scots language, used in Cumberland, the number 17 is "tiny bumfit"; I think the tiny bumfit bus goes from Penrith to Carlisle.

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

It is a common ritual when designing a database to add or remove NULL constraints, but there are a couple of problems that can cause you grief when you are making changes to already-populated tables. This can happen when you try to add a new column that can’t accept NULL values, or to change an existing, nullable column into a NOT NULL column. SQL Prompt will warn you (EI028) if it detects code that will attempt to add a NOT NULL column to an existing table, without specifying a default value.

I’ll demonstrate these problems and then show you how to develop build scripts that apply these sorts of alterations. I’ll show how these can work regardless of whether you’re building a new version of the table from scratch, with the alterations, or if you need to update an existing table so that it incorporates these changes.

Adding a NOT NULL column to a populated table

We have a table, CountingWords, in which we record the words used to count, in Old Welsh. We have ambitions to count all the way up to 20, but currently only know how to count to 10.

Listing 1: Un, Dau, Tri – version 1 of the CountingWords table

Having released the first version of this table, we realize quickly that we really should have recorded the name of the language, so we alter the design of the table to add a TheLanguage column, which cannot accept NULLs.

Immediately SQL Prompt warns us of the danger:

If we ignore SQL Prompt’s warnings and execute this, we will get an error.

Msg 4901, Level 16, State 1, Line 34
ALTER TABLE only allows columns to be added that can contain nulls, or have a DEFAULT definition specified, or the column being added is an identity or timestamp column, or alternatively if none of the previous conditions are satisfied the table must be empty to allow addition of this column. Column 'TheLanguage' cannot be added to non-empty table 'CountingWords' because it does not satisfy these conditions.

The error message is explicit, and it is easily fixed by defining a DEFAULT constraint so that SQL Server can insert a default value for this new column, for each row.

Listing 2: Specifying a default when adding a NOT NULL column

Put simply, if a column is being added, and NULL values aren’t allowed, then you must provide a value to put into every row. As our table only had one language right now, ‘old welsh’, that wasn’t too hard.

Of course, we’ll want to record how to count in other languages too, such as Manx, Cornish or Cumbrian, so to enforce some data integrity, we need instead to create a parent table called Location that defines each of the languages, and where they were first recorded.

Let’s get that out the way. For this table, we can’t just drop and recreate the table without data-loss, and we’ll get an error anyway once our CountingWords table references this table via a FOREIGN KEY constraint. We need to take precautionary steps. I’ll use a simple, but rather strange, technique that ensures there is no damage, in terms of data loss, if the code is re-run. This script is obliged to run as several batches because CREATE TABLE statements must be at the start of a batch, and it is hard to execute code conditionally across batches.

Listing 3: Version 1 of the Location table

Of course, now we also need to modify the CountingWords table so that its TheLanguage column is a FOREIGN KEY, referencing the new Location table, but we’ll deal with that a little later.

Altering a nullable column to make it non-nullable

Quickly, we decide that allowing NULL values in TheWord column of CountingWords was a design mistake that we want to fix. We already learned that, if the table contains data, SQL Server won’t let us make a column non-nullable unless we provide a default value for it, in this case just a blank string.

Msg 515, Level 16, State 2, Line 58
Cannot insert the value NULL into column 'Word', table 'PhilFactor.dbo.CountingWords'; column does not allow nulls. UPDATE fails.
The statement has been terminated.

Listing 4: Failed attempt to make the Word column NOT NULL

Aiee! We still can’t make column non-nullable, even though we’ve told SQL Server what to insert for NULL columns! First, we must remove any existing NULLs explicitly, by updating all the rows with the default value:

Listing 5: Updating existing rows with the default value before making a column NOT NULL

So, that worked fine.

Rolling out all the changes

It’s time to roll out all these changes to people who only have version 1 of the design, where there was no Location table, and where CountingWords had no TheLanguage column and a nullable Word column.

Rolling out the new Location table, since it’s the first version of that table, is no real problem (see Listing 3). However, rolling out the changes to the new CountingWords table requires adding a TheLangauge column, which does not allow NULLs, and changing the Word column to be NOT NULL, in both cases avoiding the problems we already discussed.

We also want our migration script to work regardless of whether we’re updating an existing v1 of CountingWords, or we need to build v2 of the table from scratch. Also, we don’t want the script to cause any harm or trigger an error if it’s accidentally re-run.

As a bonus, in either case, we’ll also need to make the TheLangauge column in CountingWords a FOREIGN KEY that auto-updates in response to updates or deletes on the parent key, as well as alter the PRIMARY KEY. As a final step, we’ll add in the Old Welsh words for 11-20, which we previously didn’t know.

Here is the migration script that will either migrate CountingWords from v1 to v2 or else create v2 from scratch, and which will cause no harm if accidentally re-run. Before you try it, either drop the CountingWords table, or rerun Listing 1 to reestablish v1 of the table.

Listing 6: A safe migration script for v2 of CountingWords

If it wasn’t for trying to ensure that the migration script worked in all circumstances, it would have been a lot simpler. I just hate build scripts that can only be run in particular circumstances.

Let’s now test it out by adding the counting words from one to twenty in a different language/region:

Listing 7: Yan, Tyan, Tethera – counting to 20 in West Cumbria

Now I’ve changed my mind. It shouldn’t be called ‘Old Welsh’, but ‘Archaic Welsh’. What a shame, but then we can now test out our foreign key constraint.

Listing 8: Cascading updates after changing the language

As if by magic, all the references have changed. I now have a database I can use!

I’ve provided a FillCountingWordsTable script, containing a full set of counting words for a variety of 47 recorded locations and languages, that you can use for more extensive testing. Apologies to Stateside friends and relatives that I left out the several Indian tribes who used the same counting rhymes. It turned out on investigation that they’d been taught knitting by the British colonists, and they’d thought that the words used to count the stitches were part of the magic.

Conclusion

We’ve just set up a rather elaborate demonstration of how to avoid some of the problems of altering tables that are already populated with data. In this example, the problems involve the use of NULL values, and happen when you try to add a new column that can’t accept NULL values, or to change an existing, nullable column into a NOT NULL column, when there are existing NULL values in the column. In the first case, you simply add a DEFAULT constraint to the column first, with a value that isn’t NULL, and in the second case you remove the NULL values first by updating the table.

These are techniques for making changes to existing tables, which is why I elaborated the demonstration to illustrate how to go about doing both of these operations as part of a resilient script that can be run regardless of whether it is a fresh build or a migration, and that can be re-run without any detrimental effects.

Tools in this post

SQL Prompt

Write, format, and refactor SQL effortlessly

Find out more