Book Review: Defensive Database Programming With SQL Server

It distils a great deal of practical experience; the writing of it was a considerable task; It packs in a great deal of information. Alex's book shows how to write robust database applications, and we can all learn from it. We took the book to a critic who never minces his words, and were relieved to find that Joe Celko liked it.

1054-defensiveProgramming620.jpg

This is not a “should have” book; it is a “must have” book. Do not be fooled by the title. This book is not about security, SQL injection or other kinds attacks against the database. It is about robust programming. If you don’t know that term, consider a horse and dog. If your horse breaks a leg, you have to shoot it; if your dog breaks a leg, he limps until you can get him to the Vet. The dog is robust; the horse is not.

Some of the advice is generic good practices and applies to more than just SQL programming or T-SQL programming. He starts with three common misconceptions:

  • the WHERE clause conditions will always be evaluated in the same order; a common cause of intermittent query failure
  • SET and SELECT always change the values of variables; a false assumption can lead to the dreaded infinite loop
  • data will be returned in some “natural order” – another common cause of intermittent query failure.

The good stuff is the tricks to get around T-SQL problems. As an example, there is a chapter entitled “How SET ROWCOUNT can break a trigger” which goes into the internals of T-SQL triggers. While he spends a lot of time explaining how TRIGGERs work and how you can mess up things, he also shows examples of constraints to enforce data integrity. This is simply good SQL, regardless of the product.

This is not a
“should have”
book; it is a
“must have”
book

He covers the newer features that developers are not likely to know. There is a detailed example of the TRY…CATCH error handling. No, it is not quite the same as the same error handling in C# and other modern programming languages.

The section I really enjoyed was on the proprietary “UPDATE ..FROM” construct. What he calls “ambiguous results” is properly known as cardinality violation. It means that you tried to put more than one value in a column of a row. T-SQL grabs one of the values without any warning and uses it.

The recommendation is to use the MERGE statement in SQL Server 2008. But if you have to work with earlier releases of T-SQL, then you need to know what can go wrong. I had never thought about using COUNT() to detect cardinality violations and signal a transaction rollback.

There is a very good section on the MERGE statement. This is new enough that most developers don’t understand it yet. The @@ROWCOUNT and TRIGGERs can be set by the MERGE and that can make problems.

The READ_COMMITTED_SNAPSHOT or SNAPSHOT isolation modes are explained with good solid examples you can test. The mantra is that set-oriented, declarative code is better than procedural code.

I also liked the section on UDFs versus in-line code to do computations. The multi-statement table-valued UDF can force the optimizer to re-execute the UDF for each row in the table the UDF is applied against. I keep seeing more and more of this poor programming because the developers are still thinking in procedural code and want their SQL to look as much like the languages they know. Sure it runs slow, will not port and prevents optimization, but who cares, as long as it looks like BASIC?

Alex comments when a T-SQL feature is not ANSI-compliant, so that you know when you are looking at dialect. For example, the UNIQUE constraint is dialect, but you can write code around it. Likewise, NULL-able foreign key constraints can leaves orphaned rows (that means we have a referencing row in one table without a reference in the other table).

Chapter 7 is entitled “Advanced Use of Constraints” and this is should be taught in training classes. Go thru it slowly. The techniques can be generalized to other problems. But the real trick is to change your mindset.

I have minor gripes with the use of needlessly proprietary code, such as using the old Sybase ‘SELECT..UNION ALL..” construct instead of “VALUES (), ..()” to construct tables. He also has violations of the ISO-11179 data element naming rules (“Employee” versus “Personnel” and so forth). I guess you feel like you need to find something you don’t like when you do a book review. But this minor; and a five-minute job for a text editor.