Handling Constraint Violations and Errors in SQL Server

The database developer can, of course, throw all errors back to the application developer to deal with, but this is neither kind nor necessary. How errors are dealt with is very dependent on the application, but the process itself isn't entirely obvious. Phil became gripped with a mission to explain...

In this article, we’re going to take a problem and use it to explore transactions, and constraint violations, before suggesting a solution to the problem.

The problem is this: we have a database which uses constraints; lots of them. It does a very solid job of checking the complex rules and relationships governing the data. We wish to import a batch of potentially incorrect data into the database, checking for constraint violations without throwing errors back at any client application, reporting what data caused the errors, and either rolling back the import or just the offending rows. This would then allow the administrator to manually correct the records and re-apply them.

Just to illustrate various points, we’ll take the smallest possible unit of this problem, and provide simple code that you can use to experiment with. We’ll be exploring transactions and constraint violations

Transactions

Transactions enable you to keep a database consistent, even after an error. They underlie every SQL data manipulation in order to enforce atomicity and consistency. They also enforce isolation, in that they also provide the way of temporarily isolating a connection from others that are accessing the database at the same time whilst a single unit of work is done as one or more SQL Statements. Any temporary inconsistency of the data is visible only to the connection. A transaction is both a unit of work and a unit of recovery. Together with constraints, transactions are the best way of ensuring that the data stored within the database is consistent and error-free.

Each insert, update, and delete statement is considered a single transaction (Autocommit, in SQL Server jargon). However, only you can define what you consider a ‘unit of work’ which is why we have explicit transactions. Using explicit transactions in SQL Server isn’t like sprinkling magic dust, because of the way that error-handling and constraint-checking is done. You need to be aware how this rather complex system works in order to avoid some of the pitfalls when you are planning on how to recover from errors.

Any good SQL Server database will use constraints and other DRI in order to maintain integrity and increase performance. The violation of any constraints leads to an error, and it is rare to see this handled well.

Autocommit transaction mode

Let’s create a table that allows us to be able to make a couple of different constraint violations. You’ll have to imagine that this is a part of a contact database that is full of constraints and triggers that will defend against bad data ever reaching the database. Naturally, there will be more in this table. It might contain the actual address that relates to the PostCode(in reality, it isn’t a one-to-one correspondence).

Listing 1: Creating the PostCodetable

This means that PostCodes in this table must be unique and they must conform to a specific pattern. Since SQL Databases are intrinsically transactional, those DML (Data Manipulation Language) statements that trigger an error will be rolled back. Assuming our table is empty, try this…

Listing 2: Inserting rows in a single statement (XACT_ABORT OFF)

Nothing there, is there? It found the bad PostCodebut never got to find the duplicate, did it? So, this single statement was rolled back, because the CHECK constraint found the invalid PostCode. Would this rollback the entire batch? Let’s try doing some insertions as separate statements to check this.

Listing 3: Single batch using separate INSERT statements (XACT_ABORT OFF)

Not only doesn’t it roll back the batch when it hits a constraint violation, but just the statement. It then powers on and finds the UNIQUE constraint violation. As it wasn’t judged as a severe ‘batch-aborting’ error, SQL Server only rolled back the two offending inserts. If, however, we substitute SET XACT_ABORT ON then the entire batch is aborted at the first error, leaving the two first insertions in place. The rest of the batch isn’t even executed. Try it.

By setting XACT_ABORT ON, we are telling SQL Server to react to any error by rolling back the entire transaction and aborting  the batch.  By default, the session setting is OFF. In this case, SQL Server merely rolls back the Transact-SQL statement that raised the error and the batch continues. Even with SET XACT_ABORT set to OFF, SQL Server will choose to roll back a whole batch if it hits more severe errors.

If we want to clean up specific things after an error, or if we want processing to continue in the face of moderate errors, then we need to use SET XACT_ABORT OFF, but there is a down-side: It is our responsibility now to make sure we can return the database to a consistent state on error…and use appropriate error handling to deal with even the trickier errors such as those caused by a cancel/timeout of the session in the middle of a transaction.

Just by changing the setting of XACT_ABORT, we can rerun the example and end up with different data in the database. This is because, with XACT_ABORT ON, the behavior is consistent regardless of the type of error. It simply assumes the transaction just can’t be committed, stops processing, and aborts the batch.

 With XACT_ABORT OFF, the behavior depends on the type of error. If it’s a constraint violation, permission-denial,  or a divide-by-zero, it will plough on. If the error dooms the transaction, such as when there is a conversion error or deadlock,  it won’t. Let’s illustrate this draconian batch-abortion.

Listing 4: Inserting rows in a batch using separate INSERT statements (XACT_ABORT ON)

If you’ve got the XACT_ABORT ON then you’ll get…

You’ll see that, in the second batch, the PostCode ‘G2 9AG’ never gets inserted because the batch is aborted after the first constraint violation.

If you set XACT_ABORT OFF, then you’ll get …

And to our surprise, we can see that we get a different result depending on the setting of XACT_ABORT. (Remember that GO is a client-side batch separator!) You’ll see that, if we insert a GO after the multi-row insert, we get the same two PostCodes in . Yes, With XACT_ABORT ON the behavior is consistent regardless of the type of error. With XACT_ABORT OFF, behavior depends on the type of error

There is a great difference in the ‘abortion’ of a batch, and a ‘rollback’. With an ‘abortion’, any further execution of the batch is always abandoned. This will happen whatever you specified for XACT_ABORT. If a type of error occurs that SQL Server considers too severe to allow you to ever commit the transaction, it is ‘doomed’. This happens whether you like it or not. The offending statement is rolled back and the batch is aborted.

  Let’s ‘doom’ the batch by putting in a conversion error.

Listing 5: Single batch using separate INSERT statements with a type conversion error (XACT_ABORT OFF)

You’ll probably notice that execution of the first batch stopped when the conversion error was detected, and just that statement was rolled back. It never found the Unique Constraint error. Then the following batch…select * from PostCode…was executed.

You can combine several statements into a unit of work using wither explicit transactions or by setting implicit transactions on. The latter requires fewer statements but is less versatile and doesn’t provide anything new, so we’ll just stick to explicit transactions

So let’s introduce an explicit transaction that encompasses several statements. We can then see what difference this makes to the behavior we’ve seen with autoCommit.

Explicit Transactions

When we explicitly declare the start of a transaction in SQL by using the BEGIN TRANSACTION statement, we are defining a point at which the data referenced by a particular connection is logically and physically consistent. If errors are encountered, all data modifications made after the BEGIN TRANSACTION can be rolled back to return the data to this known state of consistency. While it’s possible to get SQL Server to roll back in this fashion, it doesn’t do it without additional logic. We either have to specify this behavior by setting XACT_ABORT to ON, so that the explicit transaction is rolled back automatically, or by using a ROLLBACK.

Many developers believe that the mere fact of having declared the start of a transaction is enough to trigger an automatic rollback of the entire transaction if we hit an error during that transaction. Let’s try it.

Listing 6: Multi-statement INSERT (single batch) using an explicit transaction

No dice. The result is exactly the same as when we tried it without the explicit transaction (see Listing 3). If we again use SET XACT_ABORT ON then the batch is again aborted at the first error, but this time, the whole unit of work is rolled back.

By using SET XACT_ABORT ON, you make SQL Server do what most programmers think happens anyway. Since it is unusual not to want to rollback a transaction following an error, it is normally safer to explicitly set it ON. However, there are times when you’d want it OFF. You might, for example, wish to know about every constraint violation in the rows being imported into a table, and then do a complete rollback if any errors happened.

Most SQL Server clients set it to OFF by default, though OLEDB sets it to ON.

Listing 7: Multi-statement INSERT (single batch) using an explicit transaction

In this batch, we execute all the insertions in separate statements, checking the volatile @@Error value. Then, we check to see whether the batch hit errors or it was successful. If it completes without any errors, we issue a COMMIT TRANSACTION to make the modification a permanent part of the database. If one or more errors are encountered, then all modifications are undone with a ROLLBACK TRANSACTION statement that rolls back to the start of the transaction.

The use of @@Error isn’t entirely pain-free, since it only records the last error, and so, if a trigger has fired after the statement you’re checking, then the @@Error value will be that corresponding to the last statement executed in the trigger, rather than your statement.

If the transaction becomes doomed, all that happens is that the transaction is rolled back without the rest of the transaction being executed, just as would happen anyway if XACT_ABORT is set to ON.

Listing 8: Multi-statement INSERT (single batch) with a doomed explicit transaction

There is a problem with this code, because I’ve issued the rollback without any qualification. If this code is called from within another transaction is will roll back to the start of the outer transaction. Often this is not what you want. I should really have declared a SavePoint to specify where to rollback to. I must explain.

Nested transactions and Savepoints

Transactions can be misleading because programmers equate them to program blocks, and assume that they can somehow be ‘nested’. All manner of routines can be called during a transaction, and some of them could, in turn, specify a transaction, but a rollback will always go to the base transaction.

Support for nested transactions in SQL Server (or other RDBMSs) simply means that it will tolerate us embedding a transaction within one or more other transactions. Most developers will assume that such ‘nesting’ will ensure that SQL Server handles each sub-transaction in an atomic way, as a logical unit of work that can commit independently of other child transactions. However, such behavior is not possible with nested transactions in SQL Server, or other RDMBSs; if the outer transaction was to allow such a thing it would be subverting the all-or-nothing rule of atomicity. SQL Server allows transactions within transactions purely so that a process can call transactions within a routine, such as a stored procedure, regardless of whether that process is within a transaction.

The use of a SavePoint can, however, allow you to rollback a series of statements within a transaction.

Without a Savepoint, a ROLLBACK of a nested transaction can affect more than just the unit of work we’ve defined . If we rollback a transaction and it is ‘nested’ within one or more other transactions, it doesn’t just roll back to the last, or innermost BEGIN TRANSACTION, but rolls all the way back in time to the start of the base transaction. This may not be what we want or expect, and could turn a minor inconvenience into a major muddle.

Listing 9: Rolling back a nested transaction without a Savepoint

As you can see, SQL Server hasn’t just rolled back the inner transaction but all the work done since the outer BEGIN TRANSACTION. You have a warning as well, 'The COMMIT TRANSACTION request has no corresponding BEGIN TRANSACTION‘ because the transaction count became zero after the rollback, it successfully inserted two rows and came to the COMMIT TRANSACTION statement.

Similarly, SQL Server simply ignores all commands to COMMIT the transaction  within ‘nested’ transactions until the batch issues the COMMIT that matches the outermost BEGIN TRANSCATION.

Listing 10: Attempting to COMMIT a nested transaction without a Savepoint

The evident desire was to commit the nested transaction, because we explicitly requested that the changes in the transaction be made permanent, and so we might expect at least something to happen, but what does? Nothing; if we have executed a COMMIT TRANSACTION in a nested transaction that is contained a parent transaction that is then rolled back, the nested transaction will also be rolled back. SQL Server ignores the nested COMMIT command and, whatever we do, nothing is committed until the base transaction is committed. In other words, the COMMIT of the nested transaction is actually conditional on the COMMIT of the parent.

One might think that it is possible to use the NAME parameter of the ROLLBACK TRANSACTION statement to refer to the inner transactions of a set of named ‘nested’ transactions. Nice try,  but the only name allowed, other than a Savepoint, is the transaction name of the outermost transaction. By adding the name, we can specify that all of the nested transactions are rolled back leaving the outermost, or ‘base’, one, whereas if we leave it out then the rollback includes the outermost transaction. The NAME parameter is only useful in that we’ll get an error if someone inadvertently wraps what was the base transaction in a new base transaction, By giving the base transaction a name, it makes it easier to identify when  we want to monitor the progress of long-running queries.

We can sort this problem out by using a SavePoint. This will allow us to do quite a bit of what we might have thought was happening anyway by nesting transactions! Savepoints are handy for marking a point in your transaction. We then have the option, later, of rolling back work performed before the current point in the transaction but after a declared savepoint within the same transaction.

 Listing 11: Using Savepoints to roll back to a ‘known’ point

When we roll backto a save point, only those statements that ran after the savepoint are rolled back. All savepoints that were established later are, of course, lost.

So, if we actually want rollback within a nested transaction , then we can create a savepoint at the start. Then, if a statement within the transaction fails, it is easy to return the data to its state before the transaction began and re-run it. Even better, we can create a transaction and call a series of stored procedures which do DML stuff. Before each stored procedure, we can create a savepoint. Then, if the procedure fails, it is easy to return the data to its state before it began and re-run the function with revised parameters or set to perform a recovery action. The downside would be holding a transaction open for too long.

The Consequences of Errors.

In our example, we’re dealing mainly with constraint violations which lead to statement termination, and we’ve contrasted them to errors that lead to batch abortion, and demonstrated that by setting XACT_ABORT ON, statement termination starts to behave more like batch-abortion errors. (‘scope-abortion’ happens when there is a compile error, ‘connection-termination’ only happens when something horrible happens, and ‘batch-cancellation’ only when the client of a session cancels it, or there is a time-out ) All this can be determined from the @@Error variable but there is nothing one can do to prevent errors from being passed back to the application. Nothing, that is, unless you use TRY...CATCH

TRY CATCH Behavior

It is easy to think that all one’s troubles are over with TRY..CATCH, but in fact one still needs to be aware of other errors

Listing 12: TRY…CATCH without a transaction

This behaves the same way whether XACT_ABORT is on or off. This catches the first execution error that has a severity higher than 10 that does not close the database connection. This means that execution ends after the first error, but there is no automatic rollback of the unit of work defined by the TRY block:  No, we must still define a transaction. This works fine for most purposes though one must beware of the fact that certain errors such as killed connections or timeouts don’t get caught.

Try-Catch behavior deals with statement termination but needs extra logic to deal well with batch-abortion. In other words, we need to deal with un-committable and doomed transactions. Here is what happens if we don’t do it properly.

Listing 13: Mishandled Batch-abort

This error will immediately abort and roll back the batch whatever you do, but the TRY-CATCH seems to handle the problem awkwardly if you set XACT_ABORT ON, and it passes back a warning instead of reporting the error. Any error causes the transaction to be classified as an un-committable or ‘doomed’ transaction. The request cannot be committed, or rolled back to a savepoint. Only a full rollback  to the start of the base transaction will do. No write operations can happen until it rolls back the transaction, only reads.

If you set XACT_ABORT off, then it behaves gracefully, but terminates after the first error it comes across, executing the code in the CATCH block.

To get around this, we can use the XACT_STATE() function. This will tell you whether SQL Server has determined that the transaction is doomed.  Whilst we can use the  @@TRANCOUNT variable to detect whether the current request has an active user transaction, we cannot use it to determine whether that transaction has been classified as an uncommitable transaction. Only XACT_STATE() will tell us if the transaction is doomed, and only only @@TRANCOUNT can be used to determine whether there are nested transactions.

Listing 14: Both Statement-termination and Batch abort handled

Reaching the Goal

So now, we can have reasonable confidence that we have a mechanism that will allow us to import a large number of records and tell us, without triggering errors, which records contain bad data, as defined by our constraints.

Sadly, we are going to do this insertion row-by-row, but you’ll see that 10,000 rows only takes arount three seconds, so it is worth the wait. We have a temporary table full of 10,000 valid PostCodes, and we’ll add in a couple of rogues just to test out what happens.

Listing 15: insert from staging table with error reporting but without rollback on error

1515-img69.gif

..and if you wanted to rollback the whole import process if you hit an error, then you could try this.

Listing 16: insert from staging table with error-reporting and  rollback on error

You can comment out the rogue PostCodes or change the XACT_ABORT settings just to check if it handles batch aborts properly.

Conclusion

To manage transactions properly, and react appropriately to errors fired by constraints, you need to plan carefully. You need to distinguish the various types of errors, and make sure that you react to all of these types appropriately in your code, where it is possible to do so. You need to specify the transaction abort mode you want, and the transaction mode, and you should monitor the transaction level and transaction state.

You should be clear that transactions are never nested, in the meaning that the term usually conveys.

Transactions must be short, and only used when necessary. A session must always be cleaned up, even when it times-out or is aborted, and one must do as much error reporting as possible when transactions have to be rolled back. DDL changes should be avoided within transactions, so as to avoid locks being placed on system tables.

The application developer should be not be forced to become too familiar with SQL Server errors, though some will inevitably require handling within application code. As much as possible, especially in the case of moderate errors such as constraint violations or deadlocks should be handled within the application/database interface.

Once the handling of constraint errors within transactions has been tamed and understood, constraints will prove to be one of the best ways of guaranteeing the integrity of the data within a database.