23 April 2019
23 April 2019

Insert Statement Without Column List (BP004)

Many production databases have failed embarrassingly as a result of INSERT code that omits a column list, usually in mysterious ways and often without generating errors. Phil Factor demonstrates the problem, and advocates a 'defense-in-depth' approach to writing SQL, in order to avoid it.

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

You can insert rows into a table without providing a column list, but this has no valid use at all. As a rule, you should avoid all temptation to make your SQL code shorter by leaving things out. There are always exceptions to every rule, of course, but the INSERT column list is so essential that I’d advise you to stop the build if SQL Prompt (BP004), or whatever static code-analysis tool you use, detects such a crime. At the same time, I’d advise a defensive and pessimistic attitude to writing SQL code that would naturally rear up at the idea of ‘chancy’ code that leaves out column lists.

The Perils of the Blind Insert

Imagine that we have a Purchase table, and then tucked away in a stored procedure far away in the database code, some code that inserts into it the results from a derived table. Here, we’ll just insert some data from the ever-obliging AdventureWorks2016.

There was a time when there was no check that the number of columns in the table source (the derived table to be loaded) was compatible with the number in the destination table. Nowadays, the number of columns in the source must be compatible with the columns in the table or in column_list. However, there is plenty of room for chaos if someone changes the purpose of a column or changes the order of columns.

Changing the purpose of a column

To demonstrate that, we must imagine that the team leader suddenly realizes that that total column is redundant, and that he needs, instead, the ShipDate column.

Unfortunately, he forgot about the INSERT routine tucked away in a stored procedure. When it runs, it triggers no errors, but in the Purchase table you’ll find nastiness.

Some of those shipping dates ought to raise eyebrows. What’s happened? The value for total, which is a money datatype, has become a date. How? Two points here. Some developers think that SQL Server will check the column names of the result set from the table source against the column names of the destination table. No, it doesn’t. What is worse, if the data types of the columns in the source and destination don’t match, SQL code will do its best to obligingly convert the former into the latter, within the strict rules about when implicit conversion is allowed.

This is what has happened here: an implicit conversion turned data that is intended as a sum of money into a date. We can show it even more simply:

If you were to try it the other way around (popping a date into a money column), then you’d get an error because there is a rule disallowing this implicit conversion.

Msg 257, Level 16, State 3, Line 28
  Implicit conversion from data type datetime to numeric is not allowed. Use the CONVERT function to run this query.

As the error informs us, we’d need to use an explicit conversion to convert a date into a sum of money

Changing column order

We’ve shown that the INSERT statement without a column list, the ‘blind insert’, is vulnerable to changes in a table’s columns, but even getting the column order wrong can cause calamity. Worse, you only get to discover this if the data doesn’t reconcile, because it can happen without triggering an error.

If the column into which you accidentally insert data is the same datatype as the intended column, only a well-constructed CHECK constraint would ever be able to save you from calamity. To demonstrate the point, we’ll create a journal table.

Now we add in some journal entries using a blind insert:

We can now see our balance.

In journal tables, entries are assigned to specific accounts using a FOREIGN KEY, and the journal keeps track of many accounts. In most countries, journal entries are entered in chronological order and debits are entered before credits. As such, someone decides that the credit (Cr) column ought to be before the debit (Dr) column:

In the absence of a column list, the INSERT statement simply assumes that the order of the columns in the VALUES clause matches the order of the columns in the table. If some developer switches the order of the columns, the routine will still work fine, in this case, but the books won’t balance, and the office will be full of grim-faced people in black shoes and charcoal-grey suits.

There is nothing here to even indicate the order of the values in the VALUES statement, so it will take a while to spot the problem. Worst of all, you’ll get the blame rather than the dev who changed the order of the columns. Not specifying column names is fine for interactive work, but if you write code that relies on the hope that nothing will ever change, then refactoring could prove to be impossible.

Adding a column list will make it clear what column of the result set goes into which column of the target table, but how can you be certain that the order in the multi-row VALUES clause, or whatever other table-source you are using, matches that of the column list? Here, just as a suggestion, is a more defensive and maintainable way of doing it, by not only adding a column list but also documenting the intended order of the VALUES clause.

The extra specifications of the column lists do nothing more than to emphasize what the table source intended each column to be, and it is easy to check by eye that they actually do so. It is more like documentation.

Defense in depth: constraints

As well as the obvious defense against this sort of thing happening, which is to specify the list of columns in order, you need constraints. Omitting them because you feel certain that they could never throw an error is like reasoning that smoke detectors aren’t necessary because they are so seldom triggered by smoke.

Let’s take that first example, the purchase table. What’s missing? The constraints, of course. The problem was bizarre and obvious, and the purchase table should have been liberally sprinkled with constraints in order to prevent it.

Now we test it out to see what happens

And of course, the alarm bells ring out immediately:

Msg 547, Level 16, State 0, Line 31
  The INSERT statement conflicted with the CHECK constraint "Shipdate_Before_Orderdate". The conflict occurred in database "business", table "dbo.Purchase".
  The statement has been terminated.

When someone makes a mistake like this, it is much better to allow it to trigger an error in testing than to leave open even a small chance it will get into production use.

The constraints in the journalEntries table would be much more dependent on the business rules in place for the business, and it is more usual for a trigger to pick up an unusual deviation from a baseline for the balance.

Summary

Just because you can leave out the column list in the INSERT statement, it doesn’t mean you should. It will introduce a fragility into your code that at some time will catch out either you or, more importantly, whoever must cope with your code, subsequently, as it’s likely to fail in ways that are hard to predict or to detect retrospectively.

After many years of experience, I’ve seen the most extraordinarily unlikely things to go wrong. The wildest and most fantastic CHECK constraint conditions get triggered, and exceptions that seem impossible will always get their day of glory, written in red letters on your screen.

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

Share this post.

Share on FacebookShare on Google+Share on LinkedInTweet about this on Twitter

You may also like

  • Article

    How to record T-SQL execution times using a SQL Prompt snippet

    Phil Factor shares a SQL Prompt snippet called timings, which he uses as a standard testbed for getting execution times for procedures and functions.

  • Article

    Finding code smells using SQL Prompt: TOP without ORDER BY in a SELECT statement (BP006)

    Using TOP in a SELECT statement, without a subsequent ORDER BY clause, is legal in SQL Server, but meaningless because asking for the TOP x rows implies that the data is guaranteed to be in a certain order, and tables have no implicit logical order. You must specify the order.

  • Article

    How to compare two DacPacs using SQL Compare

    Feodor shows how to use Data-Tier Application Packages (DacPacs), together with the Data-Tier Application Framework (DacFx), as an alternative way of scripting out a database, and deploying database changes, in cases where SQL Compare can't get direct access to the production database.

  • Article

    Monitoring SQL Server Security: What's Required?

    If we know how a database is likely to be attacked, we can arrange our lines of defense, and install the monitors required to detect any attempts. However, some types of attack are difficult to imagine, so we also need our monitoring tool to be adaptable, so that it can collect a more diffuse collection of metrics, and then help us determine the reason for any sudden change in the patterns of access .

  • University

    Take the SQL Prompt course

    This Redgate University course takes you from installation all the way up to getting the most out of the advanced operations in SQL Prompt. As well as autocompleting your code, you’ll learn how SQL Prompt can help you with code formatting, object renaming, code analysis, and other useful tip and tricks.

  • Forums

    SQL Prompt Forum

    Write, format, and refactor SQL effortlessly