Snapshot Isolation Level and Concurrent Modification Collisions – On Disk and In Memory OLTP


This is part 1 of my In Memory OLTP blogs for my upcoming book project.

In this blog I want to take a (reasonably) quick look at how concurrent modifications are handled under the SNAPSHOT isolation level. There are two different flavors of SNAPSHOT isolation level you can deal with, and while they are similar, there are some very big differences internally, as well as in how they behave when you interact with them.

Generally speaking, SNAPSHOT isolation level is an implementation of an optimistic concurrency control system. Optimistic concurrency control resolves isolation issues during the execution of a transaction by letting potentially conflicting operations occur, then checking for problems at COMMIT time or, in some cases, cancelling the transaction during execution. This is unlike the lock based isolation mechanism we have grown to love/hate.

In SQL Server 2005, Microsoft implemented SNAPSHOT isolation level for on disk tables using a limited version of optimistic concurrency control, in that it still used locks to isolate certain operations from one another. Now, in SQL Server 2014 and beyond, they have implemented a more complete version of SNAPSHOT based isolation that involves no locks or latches at all, but significantly changes how you as a programmer need to handle possible collisions.

In this blog, I will demonstrate how isolation is implemented for simple inserts and updates that attempt to insert duplicated data or update the same row. In SQL Server 2014, the lone unique index/primary key was the only real location where a collision could occur, but in 2016, they are hopefully adding FOREIGN KEYs and more than one uniqueness constraint per table. Collisions that arise from these constraints will almost certainly follow the patterns shown by the current crop of collisions.

The scripts were executed on:

SELECT @@version

Microsoft SQL Server 2014 – 12.0.2269.0 (X64);     Jun 10 2015 03:35:45 ; Copyright (c) Microsoft Corporation; Developer Edition (64-bit) on Windows NT 6.3 <X64> (Build 10240: )
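
For reference, the essential database configuration for these demos boils down to the following minimal sketch (the database name matches the one that appears in a later error message; the file path is an assumption you will need to adjust for your system):

CREATE DATABASE snapshotDemo;
GO
--allow on disk tables to be accessed in SNAPSHOT isolation level
ALTER DATABASE snapshotDemo SET ALLOW_SNAPSHOT_ISOLATION ON;
GO
--add the filegroup and container that in memory OLTP requires
ALTER DATABASE snapshotDemo ADD FILEGROUP snapshotDemo_inMemFiles
    CONTAINS MEMORY_OPTIMIZED_DATA;
ALTER DATABASE snapshotDemo ADD FILE
    (NAME = 'snapshotDemo_inMem', FILENAME = 'C:\Data\snapshotDemo_inMem')
    TO FILEGROUP snapshotDemo_inMemFiles;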

Using a database that has been configured for in memory OLTP (my script can be found here), I start by creating the following two tables:

create schema demo;
go

create table demo.OnDisk
(
    onDiskId int CONSTRAINT PKOnDisk PRIMARY KEY,
    value varchar(20)
);
go

create table demo.InMem
(
    InMemId int NOT NULL,
    value varchar(20) NOT NULL,
    CONSTRAINT XPKInMem PRIMARY KEY NONCLUSTERED HASH
    ( InMemId) WITH ( BUCKET_COUNT = 20) --small table!
) WITH ( MEMORY_OPTIMIZED = ON , DURABILITY = SCHEMA_AND_DATA );

On Disk SNAPSHOT

To show how collisions are handled for on disk tables in SNAPSHOT isolation level, I will execute the following code on one connection. Note that in all cases I will be using an explicit transaction for the demonstrations, to “pause” time. The same effects are seen in implicit and very fast running transactions, but they are MUCH harder to demo. If you want to see the effects of concurrency in “real time”, build an example using Adam Machanic’s SQLQueryStress (http://www.datamanipulation.net/SQLQueryStress/). I used it when developing my in memory presentation, and given data sets with duplications and lots of concurrency, you can watch the effects occur:

--Connection1
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
go

BEGIN TRANSACTION;
insert into demo.OnDisk (onDiskId,value)
values (1, 'Connection 1');

And then on a second connection, execute:

--Connection2
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
go

BEGIN TRANSACTION;
insert into demo.OnDisk (onDiskId,value)
values (1, 'Connection 2');
GO

Connection 2 is blocked, waiting to see what happens in the other connection. This is a very major difference between the on disk snapshot isolation level and the lockless in memory version we will see shortly. COMMIT the transaction on connection 1, and in the second connection you will see the error you would expect:

Msg 2627, Level 14, State 1, Line 6
Violation of PRIMARY KEY constraint 'PKOnDisk'. Cannot insert duplicate key in object 'demo.OnDisk'. The duplicate key value is (1).
The statement has been terminated.

Be sure to clear all transactions on both connections. The data in the table should look like:

select @@TRANCOUNT as trancount, *
from demo.OnDisk;

trancount   onDiskId    value
----------- ----------- --------------------
0           1           Connection 1

Now we will try to update the row on two different connections; since the insert was blocked, you can probably guess what is going to happen here too.

Execute:

SET TRANSACTION ISOLATION LEVEL SNAPSHOT
go

BEGIN TRANSACTION
update demo.OnDisk 
set    value = 'Still Connection 1'
where  onDiskId = 1

On one connection, and then afterwards, the following in another:

SET TRANSACTION ISOLATION LEVEL SNAPSHOT
go

BEGIN TRANSACTION
update demo.OnDisk 
set    value = 'I said Connection 2'
where onDiskId = 1

Blocked. Now, commit the transaction on the first connection, and you will see:

Msg 3960, Level 16, State 2, Line 5
Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table 'demo.OnDisk' directly or indirectly in database 'snapshotDemo' to update, delete, or insert the row that has been modified or deleted by another transaction. Retry the transaction or change the isolation level for the update/delete statement.

The key bit of information in there is the word “conflict”. As you will see in the next section of the blog on in memory OLTP tables, how and when a transaction is canceled due to a modification collision is directly related to whether it is a resource conflict or a conflict with the constraints of the system.

In Memory SNAPSHOT

Turning our attention to the other SNAPSHOT isolation level, let’s take a look at how it does conflict resolution. The biggest difference you will notice is that no connection ever waits, unlike with on disk tables. Wherever you might otherwise see a waiting condition, the engine either defers the check until COMMIT time (if there are no resource conflicts) or cancels the transaction immediately (if there are). When the next iteration of my database design book is released, I will have some discussion of the internals of how in memory objects are stored, but I will not go into that here just yet. In both places I will include the following statement: “do you want to know more? Check out Kalen Delaney’s book here: http://www.red-gate.com/community/books/sql-server-internals-in-memory-oltp”, largely because she is a great friend and a great writer/teacher, and secondarily because my plan is to show the structures conceptually, strictly as a mental guide to help you understand how your designs are affected by the internals, without going any deeper.

Starting with the insert case from before, using the demo.InMem table, on one connection, execute:

--Connection 1
BEGIN TRANSACTION

insert into demo.inMem (inMemId,value)
values (1, 'Connection 1')

The first thing you will probably notice is a really annoying error message. Since we set the session isolation level to SNAPSHOT earlier, the following error is thrown (this is why, later, we will need a table hint to access the table in SNAPSHOT isolation level):

Msg 41332, Level 16, State 0, Line 4
Memory optimized tables and natively compiled stored procedures cannot be accessed or created when the session TRANSACTION ISOLATION LEVEL is set to SNAPSHOT.

So we change the batch to:

--Connection1
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
BEGIN TRANSACTION

insert into demo.inMem (inMemId,value)
values (1, 'Connection 1')

And then in a second connection:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED
BEGIN TRANSACTION

insert into demo.inMem (inMemId,value)
values (1, 'Connection 2')

Both complete immediately. To look at the data, you will naturally want to try:

select *
from demo.inMem

But this will result in an error message that will also end the transaction:

Msg 41368, Level 16, State 0, Line 46
Accessing memory optimized tables using the READ COMMITTED isolation level is supported only for autocommit transactions. It is not supported for explicit or implicit transactions. Provide a supported isolation level for the memory optimized table using a table hint, such as WITH (SNAPSHOT).

So don’t do that! Instead, use:

select *
from demo.inMem WITH (SNAPSHOT)

On connection 1, this returns:

inMemId     value
----------- --------------------
1           Connection 1

Connection 2 believes otherwise:

inMemId     value
----------- --------------------
1           Connection 2

Next COMMIT Connection 1, and you will see it succeeded. Going to Connection 2, check the data (and the @@trancount) again:

select @@trancount as tranCount, *
from demo.inMem WITH (SNAPSHOT)

On my first try, I really expected it to fail… But it did not (the value only violates a constraint at the data level; it is not a resource conflict), and Connection 2 still sticks with its expectation that what it knows about the state of the data is correct:

tranCount   InMemId     value
----------- ----------- --------------------
1           1           Connection 2

Now commit the transaction on Connection 2, and you will see the following error:

Msg 41325, Level 16, State 1, Line 17
The current transaction failed to commit due to a serializable validation failure.

This could be considered annoying, as it doesn’t initially make sense. Why a serializable validation? Because modifications are validated as if they were done in the SERIALIZABLE isolation level (just as in READ COMMITTED, modifications take exclusive locks regardless of your trying to use NOLOCK!). In this case, the other row with the PK of 1 fails the no phantom criterion of serializable, hence the failure. Try again to insert the row, and you get:

Msg 2627, Level 14, State 1, Line 3
Violation of PRIMARY KEY constraint 'XPKInMem'. Cannot insert duplicate key in object 'InMem'. The duplicate key value is (1).
The statement has been terminated.

I won’t go through it here, but if you are trying out my code, do the same experiment with different primary key values to see that inserting 1 for the PK on one connection and 2 on the other does not collide.
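
The non-colliding case is a quick sketch (assuming the table has been cleared between experiments):

--Connection 1
BEGIN TRANSACTION;
insert into demo.inMem (inMemId, value)
values (1, 'Connection 1');

--Connection 2
BEGIN TRANSACTION;
insert into demo.inMem (inMemId, value)
values (2, 'Connection 2');

Both transactions will COMMIT without a validation failure, since the two rows share no resource and violate no constraint.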

What is interesting is to start over, but this time, on Connection 2 insert the row, fetch it, and then delete it. When you commit the transaction, what happens? (I just decided to try this, and I am not 100% sure myself… I guess it succeeds)

SET TRANSACTION ISOLATION LEVEL READ COMMITTED
BEGIN TRANSACTION

insert into demo.inMem (inMemId,value)
values (1, 'Connection 2')
GO

select *
from demo.inMem WITH (SNAPSHOT)
GO

delete from demo.inMem with (SNAPSHOT)
where InMemId = 1

So far so good, this executes, now go back to connection 1 and COMMIT. Go back to Connection 2, and you will see:

Msg 41325, Level 16, State 1, Line 12
The current transaction failed to commit due to a serializable validation failure.

So I was wrong (and had started to suspect I was wrong as I kept reasoning through the problem, which is a lot easier when blogging than when trying to solve a customer issue on a high throughput system!). Even though the net effect is that I did nothing to the data, the two transactions touched the same resources.

Now let’s do the same with an update statement, after clearing the table:

rollback -- may not be necessary
go

delete from demo.inMem;

insert into demo.inMem(inMemId, value)
values (1,'First Row'),(2,'Second Row');

On connection 1, change the value to ‘1st Row’:

--Connection 1
BEGIN TRANSACTION;

update demo.inMem with (SNAPSHOT)
set value = '1st Row'
where inMemId = 1;

Now, on a different connection, first set the second row’s value:

--Connection2
BEGIN TRANSACTION;

update demo.inMem with (SNAPSHOT)
set value = '2nd Row'
where inMemId = 2;

Now, inside of the same transaction, try to update row 1…

update demo.inMem with (SNAPSHOT)
set value = 'Not the 2ndRow'
where inMemId = 1;

Immediately you are given an error (well, errors):

Msg 41302, Level 16, State 110, Line 3
The current transaction attempted to update a record that has been updated since this transaction started. The transaction was aborted.

Msg 3998, Level 16, State 1, Line 1
Uncommittable transaction is detected at the end of the batch. The transaction is rolled back.
The statement has been terminated.

Now, on connection 2 you can see that the data has reverted and the transaction has been rolled back (as the error message said it was!):

select @@trancount as tranCount, *
from demo.inMem WITH (SNAPSHOT)

tranCount   InMemId     value
----------- ----------- --------------------
0           1           First Row
0           2           Second Row

Conclusion

Concurrency handling/isolation using optimistic concurrency controls (or semi-optimistic, as in the on disk version of SNAPSHOT) for data modifications is far trickier than using the basic pessimistic, lock based methods. In my previous books, I have suggested that you would likely want to shy away from SNAPSHOT for connections that are doing data modifications, mostly because they still use locks, yet fail anyway if another connection has touched their resources, making the performance gain not nearly worth the effort.

However, In Memory OLTP can EASILY be worth your time, as it uses a lock-less/latch-less mechanism that makes any waiting negligible. The side effect is that you can easily get resource collisions, depending on how well you tune your queries and on the likelihood of multiple users accessing the same data for modification (and more, as I will show whenever I get to how retrieval interacts with modification statements someday). Add to this things like FOREIGN KEY constraints, multiple UNIQUE constraints, and multiple queries in a stored procedure, and things can get tricky.

The one major change to your thinking is that you need to see messages like the following:

Msg 41325, Level 16, State 1, Line 17
The current transaction failed to commit due to a serializable validation failure.

Not as error messages, but merely as messages akin to this familiar deadlock message:

Transaction (Process ID %d) was deadlocked on {%Z} resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

Warnings to try again. The fact is, that is what the deadlock message states explicitly in its text: you were tangled up with another user. An occasional deadlock (if handled by code as a simple retry of the batch/transaction) is nothing to worry about, but large numbers of them are a sign of trouble. I expect the same will be true if you see tons of serializable validation failure messages (not as fun to say as “deadlock”!): you have concurrency issues.
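
To that end, your modification code for in memory tables ought to include a retry handler. The following is a minimal sketch of one pattern, not production code; the variable names and retry count are my own, and I am assuming the errors worth retrying are the two we saw (41302 and 41325) plus their documented siblings 41305 (repeatable read validation failure) and 41301 (commit dependency failure):

--a minimal retry wrapper; names and retry count are illustrative
DECLARE @retries int = 3;

WHILE @retries > 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;

        update demo.inMem with (SNAPSHOT)
        set    value = 'Updated Value'
        where  inMemId = 1;

        COMMIT TRANSACTION;
        SET @retries = 0; --success, stop looping
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0 ROLLBACK TRANSACTION;

        --retry only the validation/conflict errors; anything else re-raises
        IF ERROR_NUMBER() IN (41301, 41302, 41305, 41325) AND @retries > 1
            SET @retries -= 1; --try the transaction again
        ELSE
            THROW; --not retryable (or out of retries)
    END CATCH;
END;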


About the author

Louis Davidson


Louis is the former editor of Simple-Talk. Prior to that, he was a corporate database developer and data architect for a non-profit organization for 25 years! Louis has been a Microsoft MVP since 2004, and is the author of a series of SQL Server database design books, most recently Pro SQL Server Relational Database Design and Implementation.