Product articles SQL Prompt SQL Code Analysis
Problems with adding NOT NULL columns…

6 August 2019

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

6 August 2019

Problems with adding NOT NULL columns or making nullable columns NOT NULL (EI028)

Phil Factor explains the problems you might encounter when adding a non-nullable column to an existing table or altering a column that contains NULL values to be non-nullable. He demos a migration script that can deploy such changes safely. You might also learn that in an archaic form of the Scots language, used in Cumberland, the number 17 is "tiny bumfit"; I think the tiny bumfit bus goes from Penrith to Carlisle.

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

It is a common ritual when designing a database to add or remove NULL constraints, but there are a couple of problems that can cause you grief when you are making changes to already-populated tables. This can happen when you try to add a new column that can’t accept NULL values, or to change an existing, nullable column into a NOT NULL column. SQL Prompt will warn you (EI028) if it detects code that will attempt to add a NOT NULL column to an existing table, without specifying a default value.

I’ll demonstrate these problems and then show you how to develop build scripts that apply these sorts of alterations. I’ll show how these can work regardless of whether you’re building a new version of the table from scratch, with the alterations, or if you need to update an existing table so that it incorporates these changes.

Adding a NOT NULL column to a populated table

We have a table, CountingWords, in which we record the words used to count, in Old Welsh. We have ambitions to count all the way up to 20, but currently only know how to count to 10.

/* we create a table. Just for our example, we create the words
used to count from one to ten in old Welsh.
(Na, nid wyf yn siaradwr Cymraeg ond rwy'n hoffi gwneud 
rhywfaint o ymchwil ieithyddol gyda'r nos)
I'd like to do from eleven to twenty as well eventually
so I'll add them and leave them NULL.
Here is the initial build script, with guard clauses
of course. We'll re-create every time it is run. 
With a few modifications we can make it so it only runs once 
which is safer if you have reckless colleagues in your shop.
 */
IF Object_Id('dbo.CountingWords') IS NOT NULL DROP TABLE dbo.CountingWords;
--we script version 1 of our table of counting words
CREATE TABLE dbo.CountingWords
  (
  TheValue INT NOT NULL,
  Word NVARCHAR(30) NULL,
  CONSTRAINT CountingWordsPK PRIMARY KEY (TheValue)
  );
GO
INSERT INTO dbo.CountingWords (TheValue, Word)
VALUES
  (1, 'Un'),  (2, 'Dau'),  (3, 'Tri'),  (4, 'Pedwar'),
  (5, 'Pump'),  (6, 'Chwech'),  (7, 'Saith'),  (8, 'Wyth'),
  (9, 'Naw'),  (10, 'Deg'),  (11, NULL),  (12, NULL),
  (13, NULL),  (14, NULL),  (15, NULL),  (16, NULL),
  (17, NULL),  (18, NULL),  (19, NULL),  (20, NULL);
GO

Listing 1: Un, Dau, Tri – version 1 of the CountingWords table

Having released the first version of this table, we realize quickly that we really should have recorded the name of the language, so we alter the design of the table to add a TheLanguage column, which cannot accept NULLs.

ALTER TABLE dbo.CountingWords ADD TheLanguage NVARCHAR(100) NOT NULL;

Immediately SQL Prompt warns us of the danger:

If we ignore SQL Prompt’s warnings and execute this, we will get an error.

Msg 4901, Level 16, State 1, Line 34
ALTER TABLE only allows columns to be added that can contain nulls, or have a DEFAULT definition specified, or the column being added is an identity or timestamp column, or alternatively if none of the previous conditions are satisfied the table must be empty to allow addition of this column. Column 'TheLanguage' cannot be added to non-empty table 'CountingWords' because it does not satisfy these conditions.

The error message is explicit, and it is easily fixed by defining a DEFAULT constraint so that SQL Server can insert a default value for this new column, for each row.

ALTER TABLE dbo.CountingWords ADD TheLanguage NVARCHAR(100) NOT NULL
          DEFAULT 'Old Welsh';

Listing 2: Specifying a default when adding a NOT NULL column

Put simply, if a column is being added, and NULL values aren’t allowed, then you must provide a value to put into every row. As our table only had one language right now, ‘old welsh’, that wasn’t too hard.

Of course, we’ll want to record how to count in other languages too, such as Manx, Cornish or Cumbrian, so to enforce some data integrity, we need instead to create a parent table called Location that defines each of the languages, and where they were first recorded.

Let’s get that out the way. For this table, we can’t just drop and recreate the table without data-loss, and we’ll get an error anyway once our CountingWords table references this table via a FOREIGN KEY constraint. We need to take precautionary steps. I’ll use a simple, but rather strange, technique that ensures there is no damage, in terms of data loss, if the code is re-run. This script is obliged to run as several batches because CREATE TABLE statements must be at the start of a batch, and it is hard to execute code conditionally across batches.

IF Object_Id('dbo.Location') IS NOT NULL SET NOEXEC ON;
 --cunning way of only executing a section
--of code on a condition. until the next SET NOEXEC OFF
--we script version 1 of our table of counting words
GO
--sadly the create table statement has to be at the start of a batch
CREATE TABLE dbo.Location
  (
  TheLanguage NVARCHAR(30) NOT NULL,
  Description VARCHAR(100) NOT NULL DEFAULT '',
  CONSTRAINT LanguageKey PRIMARY KEY (TheLanguage)
  );
--now we insert the row we need for our existing data
INSERT INTO dbo.Location (TheLanguage) VALUES ('Old Welsh');
GO
SET NOEXEC OFF;

Listing 3: Version 1 of the Location table

Of course, now we also need to modify the CountingWords table so that its TheLanguage column is a FOREIGN KEY, referencing the new Location table, but we’ll deal with that a little later.

Altering a nullable column to make it non-nullable

Quickly, we decide that allowing NULL values in TheWord column of CountingWords was a design mistake that we want to fix. We already learned that, if the table contains data, SQL Server won’t let us make a column non-nullable unless we provide a default value for it, in this case just a blank string.

ALTER TABLE CountingWords ADD CONSTRAINT WordConstraint DEFAULT '' FOR Word;
ALTER TABLE CountingWords ALTER COLUMN Word NVARCHAR(30) NOT NULL;
Msg 515, Level 16, State 2, Line 58
Cannot insert the value NULL into column 'Word', table 'PhilFactor.dbo.CountingWords'; column does not allow nulls. UPDATE fails.
The statement has been terminated.

Listing 4: Failed attempt to make the Word column NOT NULL

Aiee! We still can’t make column non-nullable, even though we’ve told SQL Server what to insert for NULL columns! First, we must remove any existing NULLs explicitly, by updating all the rows with the default value:

UPDATE CountingWords SET Word = DEFAULT WHERE Word IS NULL;
ALTER TABLE CountingWords ALTER COLUMN Word NVARCHAR(30) NOT NULL;

Listing 5: Updating existing rows with the default value before making a column NOT NULL

So, that worked fine.

Rolling out all the changes

It’s time to roll out all these changes to people who only have version 1 of the design, where there was no Location table, and where CountingWords had no TheLanguage column and a nullable Word column.

Rolling out the new Location table, since it’s the first version of that table, is no real problem (see Listing 3). However, rolling out the changes to the new CountingWords table requires adding a TheLangauge column, which does not allow NULLs, and changing the Word column to be NOT NULL, in both cases avoiding the problems we already discussed.

We also want our migration script to work regardless of whether we’re updating an existing v1 of CountingWords, or we need to build v2 of the table from scratch. Also, we don’t want the script to cause any harm or trigger an error if it’s accidentally re-run.

As a bonus, in either case, we’ll also need to make the TheLangauge column in CountingWords a FOREIGN KEY that auto-updates in response to updates or deletes on the parent key, as well as alter the PRIMARY KEY. As a final step, we’ll add in the Old Welsh words for 11-20, which we previously didn’t know.

Here is the migration script that will either migrate CountingWords from v1 to v2 or else create v2 from scratch, and which will cause no harm if accidentally re-run. Before you try it, either drop the CountingWords table, or rerun Listing 1 to reestablish v1 of the table.

--we now script version 2
IF Object_Id('dbo.CountingWords') IS NULL
  BEGIN
    --we script version 2 of our table of counting words if it
    --doesn't already exist
    CREATE TABLE dbo.CountingWords
      (
      TheValue INT NOT NULL,
      Word NVARCHAR(30) NOT NULL CONSTRAINT WordConstraint DEFAULT '',
      TheLanguage NVARCHAR(30) NOT NULL 
        CONSTRAINT LanguageConstraint REFERENCES dbo.Location(TheLanguage)
          ON DELETE CASCADE
          ON UPDATE CASCADE 
       CONSTRAINT CountingWordsPK PRIMARY KEY(TheValue, TheLanguage)
      );
  END;
ELSE /* else we need to add a column and change the primary key
     constraint */
  BEGIN
    IF NOT EXISTS -- only run if the column does not exist
      (
      SELECT *  FROM sys.columns
        WHERE name LIKE 'TheLanguage'
          AND object_id = Object_Id('dbo.CountingWords')
      )
      BEGIN 
        -- first we need to add the language column
        ALTER TABLE CountingWords ADD TheLanguage NVARCHAR(30) NOT NULL
          DEFAULT 'Old Welsh' CONSTRAINT LanguageConstraint 
            REFERENCES dbo.Location(TheLanguage)
             ON DELETE CASCADE
             ON UPDATE CASCADE ;
      END
       --now we need to alter the primary key
      ALTER TABLE CountingWords DROP CONSTRAINT CountingWordsPK;
      ALTER TABLE CountingWords --and add the new version
        ADD CONSTRAINT CountingWordsPK 
          PRIMARY KEY(TheValue, TheLanguage);
      IF NOT EXISTS -- do we need to add the default and remove the nulls?
        (SELECT * FROM sys.default_constraints 
          WHERE name LIKE 'WordConstraint')
        BEGIN
          ALTER TABLE CountingWords ADD CONSTRAINT WordConstraint DEFAULT '' FOR Word;
/* You can specify NOT NULL in ALTER COLUMN only if the column contains no null values. 
The null values must be updated to some value before the ALTER COLUMN NOT NULL is allowed,*/
          UPDATE CountingWords SET Word = DEFAULT WHERE Word IS NULL;
        END;
    IF NOT EXISTS --now finally we can make it not null 
        (SELECT * FROM sys.columns 
           WHERE name LIKE 'word' AND is_nullable = 0)    
       ALTER TABLE CountingWords ALTER COLUMN Word NVARCHAR(30) NOT NULL;
  END;
GO
IF EXISTS --do we need to add in the welsh words we didn't know 
  (SELECT * FROM dbo.CountingWords
  WHERE TheLanguage LIKE 'Old Welsh' AND word LIKE ''
  )
  --yes we need to add those words to replace those pesky blanks
  UPDATE CountingWords
    SET Word = welsh.word
    FROM CountingWords AS cw
      INNER JOIN
        (
        VALUES ('Un ar ddeg', 11), ('Deuddeg', 12), ('Tri ar ddeg', 13),
          ('Pedwar ar ddeg', 14), ('Pymtheg', 15), ('Un ar bymtheg', 16),
          ('Dau ar bymtheg', 17), ('Deunaw', 18), ('Pedwar ar bymtheg', 19),
          ('Ugain', 20)
        ) AS welsh (word, meaning)
       ON welsh.meaning = cw.TheValue;

Listing 6: A safe migration script for v2 of CountingWords

If it wasn’t for trying to ensure that the migration script worked in all circumstances, it would have been a lot simpler. I just hate build scripts that can only be run in particular circumstances.

Let’s now test it out by adding the counting words from one to twenty in a different language/region:

DECLARE @Language NVARCHAR(30) = 'West Cumbrian';
IF NOT EXISTS 
  (SELECT * FROM dbo.Location 
  WHERE TheLanguage LIKE @Language
  ) INSERT INTO dbo.Location (TheLanguage) 
      VALUES (@Language);
/* and now we can add in any other ways of counting up to 
twenty that we want */
IF NOT EXISTS (SELECT * FROM CountingWords WHERE Thelanguage LIKE @Language)
INSERT INTO CountingWords (TheValue, Word, TheLanguage)
  SELECT TheValue, word, @Language
    FROM
      (
      VALUES ('yan', 1), ('tyan', 2), ('tethera', 3), ('methera', 4),
        ('pimp', 5), ('sethera', 6), ('lethera', 7), ('hovera', 8),
        ('dovera', 9), ('dick', 10), ('yan-a-dick', 11), ('tyan-a-dick', 12),
        ('tethera-dick', 13), ('nethera-dick', 14), ('bumfit', 15),
        ('yan-a-bumfit', 16), ('tyan-a-bumfi t', 17), ('tithera-bumfit', 18),
        ('methera-bumfit', 19), ('giggot', 20)
      ) AS f (word, TheValue);
GO

Listing 7: Yan, Tyan, Tethera – counting to 20 in West Cumbria

Now I’ve changed my mind. It shouldn’t be called ‘Old Welsh’, but ‘Archaic Welsh’. What a shame, but then we can now test out our foreign key constraint.

UPDATE dbo.Location SET TheLanguage = 'Archaic Welsh' WHERE TheLanguage LIKE 'Old Welsh' 
SELECT * FROM location
SELECT * FROM CountingWords

Listing 8: Cascading updates after changing the language

As if by magic, all the references have changed. I now have a database I can use!

I’ve provided a FillCountingWordsTable script, containing a full set of counting words for a variety of 47 recorded locations and languages, that you can use for more extensive testing. Apologies to Stateside friends and relatives that I left out the several Indian tribes who used the same counting rhymes. It turned out on investigation that they’d been taught knitting by the British colonists, and they’d thought that the words used to count the stitches were part of the magic.

Conclusion

We’ve just set up a rather elaborate demonstration of how to avoid some of the problems of altering tables that are already populated with data. In this example, the problems involve the use of NULL values, and happen when you try to add a new column that can’t accept NULL values, or to change an existing, nullable column into a NOT NULL column, when there are existing NULL values in the column. In the first case, you simply add a DEFAULT constraint to the column first, with a value that isn’t NULL, and in the second case you remove the NULL values first by updating the table.

These are techniques for making changes to existing tables, which is why I elaborated the demonstration to illustrate how to go about doing both of these operations as part of a resilient script that can be run regardless of whether it is a fresh build or a migration, and that can be re-run without any detrimental effects.

Share this post.

You may also like