Be careful when exchanging old text field by varchar(max)

Image and Text fields are deprecated and should be replaced by varchar(max) and varbinary(max), but there are details about how these fields are stored that should be analyzed.

All these types are called LOB types, types that store Large OBjects. However, there are a big difference between how the old LOB data types (text,ntext, image) and the new LOB data types (varchar(max), nvarchar(max), varbinary(max)) are stored.

While the old LOB types store the data in separated pages, only storing a 16 bit pointer in the row, the new LOB types store the data in row, together the record data, whenever possible. Both, the old and the new LOB types, can be configured to work in an opposite way.

Why should we care? The main concern should be about how many pages SQL Server will need to read to return our data.

There are big differences in the amount of pages SQL Server will need to read. If the LOB fields are not part of the query, separated pages with the LOB data can be ignored, but in row data can’t be ignored and will increase the amount of pages SQL Server will read. On the other hand, when the LOB fields are part of the query and many LOB data are small enough to be stored in row, having the LOB data in row reduces the amount of pages SQL Server needs to read to retrieve our records.

We need to understand the configuration options for these kind of fields to be able to decide the best configuration for each table in our database, according to the most common queries over our tables, if they include the LOB fields or they don’t.

Let’s do some tests with both kind’s of LOB types and compare the results. I will insert two hundred records in LOB fields with different configurations and compare the results.

How old LOB types work

Let’s execute the script bellow to create a new table with a LOB field using text data type and insert two hundred records.

retrieve detailed information about the page allocations, let’s use it to analyze how the information is stored:

LOB1.png

In the above image, we can find the following types of pages:

Data Page: Pages with the row data of the table. We have only one in this case.

IAM Page: Index Allocation Map, these pages works as an index for all allocated pages. There is one for the data pages and one for the lob data pages.

TEXT MIX PAGE: These pages stores the LOB data, you can notice the allocation type as LOB Data.

We can notice in the above results that the default behavior of text field (and all other old LOB data types) is to store the information in separated pages, especially for LOB Data. If we query this table without include the LOB field, SQL Server needs to read only the IAM’s and the in-row data pages, if we include the LOB field all the LOB Data pages needs to be retrieved from the disk.

Let’s execute some queries to prove this. First, without the LOB field:

LOB2.png

You can see in the image below that the query resulted in three (3) logical reads: 2 IAM’s and the in-row data page.

Now let’s execute another query with the LOB field:

Now we have a large amount of LOB logical read, 666 in total.

How the new LOB fields work

Let’s repeat the same test with the new LOB types, so we can compare the results to choose the best configuration for each case.

First, let’s create the table and insert the data:

Now we need to check the page allocation:

You will notice in the above image that we have 38 data pages. This is the main point in this article: While the default behavior of the old LOB type fields is to store the data in LOB data pages, the default behavior of the new LOB type fields is to store all the information in row whenever possible. If the information is bigger than the space in the page it will be stored in LOB data pages.

In this test, queries including the LOB field and queries without the LOB field will have the same performance, because they will need to read the same pages.

There is a different configuration to change the behavior of the new LOB types. We can set the attribute ‘Large value types out of row’ for the table, so all the information in the new LOB types will be stored in LOB data pages.

Let’s drop the table, create again with ‘Large value types out of row’ activated and insert the data again:

LOB5.png

Checking the page allocation again, we will find only one data page and fifty-four (54) LOB data pages. The tests with the queries with and without the LOB field will show the same: Without the LOB field only the data pages will be read, so the query will perform much better than the one that includes the LOB field.

Conclusion

The new LOB types uses exactly the opposite default storage pattern than the old LOB types. The text and image fields are deprecated, but when you switch to the new varchar(max) and varbinary(max) you will be changing the way the information is stored and the performance of your environment, unless you set the ‘Large value types out of row’ in all your tables.