Image and Text fields are deprecated and should be replaced by varchar(max) and varbinary(max), but there are details about how these fields are stored that should be analyzed.
All these types are called LOB types, types that store Large OBjects. However, there are a big difference between how the old LOB data types (text,ntext, image) and the new LOB data types (varchar(max), nvarchar(max), varbinary(max)) are stored.
While the old LOB types store the data in separated pages, only storing a 16 bit pointer in the row, the new LOB types store the data in row, together the record data, whenever possible. Both, the old and the new LOB types, can be configured to work in an opposite way.
Why should we care? The main concern should be about how many pages SQL Server will need to read to return our data.
There are big differences in the amount of pages SQL Server will need to read. If the LOB fields are not part of the query, separated pages with the LOB data can be ignored, but in row data can’t be ignored and will increase the amount of pages SQL Server will read. On the other hand, when the LOB fields are part of the query and many LOB data are small enough to be stored in row, having the LOB data in row reduces the amount of pages SQL Server needs to read to retrieve our records.
We need to understand the configuration options for these kind of fields to be able to decide the best configuration for each table in our database, according to the most common queries over our tables, if they include the LOB fields or they don’t.
Let’s do some tests with both kind’s of LOB types and compare the results. I will insert two hundred records in LOB fields with different configurations and compare the results.
How old LOB types work
Let’s execute the script bellow to create a new table with a LOB field using text data type and insert two hundred records.
1 2 3 4 5 6 7 8 9 10 11 |
create table LOB1 (id int identity(1,1) not null primary key, name varchar(50), lfield text ) go insert into LOB1 values('John',replicate('x',900)) go 100 -- The batch will be repeated one hundred times insert into LOB1 values('John',replicate('x',2000)) go 100 |
retrieve detailed information about the page allocations, let’s use it to analyze how the information is stored:
1 2 3 4 5 6 7 8 |
select allocation_unit_type_desc,page_type_desc, count(*) as TotalPages from sys.dm_db_database_page_allocations (DB_ID('testDB'),OBJECT_ID('LOB1'),1,1,'DETAILED') group by allocation_unit_type_desc,page_type_desc order by allocation_unit_type_desc |
In the above image, we can find the following types of pages:
Data Page: Pages with the row data of the table. We have only one in this case.
IAM Page: Index Allocation Map, these pages works as an index for all allocated pages. There is one for the data pages and one for the lob data pages.
TEXT MIX PAGE: These pages stores the LOB data, you can notice the allocation type as LOB Data.
We can notice in the above results that the default behavior of text field (and all other old LOB data types) is to store the information in separated pages, especially for LOB Data. If we query this table without include the LOB field, SQL Server needs to read only the IAM’s and the in-row data pages, if we include the LOB field all the LOB Data pages needs to be retrieved from the disk.
Let’s execute some queries to prove this. First, without the LOB field:
1 2 |
set statistics io on select id,name from LOB1 |
You can see in the image below that the query resulted in three (3) logical reads: 2 IAM’s and the in-row data page.
Now let’s execute another query with the LOB field:
1 |
select id,name, lfield from LOB1 |
Now we have a large amount of LOB logical read, 666 in total.
How the new LOB fields work
Let’s repeat the same test with the new LOB types, so we can compare the results to choose the best configuration for each case.
First, let’s create the table and insert the data:
1 2 3 4 5 6 7 8 9 10 11 |
create table LOB2 (id int identity(1,1) not null primary key, name varchar(50), lfield varchar(max) ) go insert into LOB2 values('John',replicate('x',900)) go 100 insert into LOB2 values('John',replicate('x',2000)) go 100 |
Now we need to check the page allocation:
1 2 3 4 5 6 |
select allocation_unit_type_desc,page_type_desc, count(*) as TotalPages from sys.dm_db_database_page_allocations(DB_ID('testDB'), OBJECT_ID('LOB2'),1,1,'DETAILED') group by allocation_unit_type_desc,page_type_desc order by allocation_unit_type_desc |
You will notice in the above image that we have 38 data pages. This is the main point in this article: While the default behavior of the old LOB type fields is to store the data in LOB data pages, the default behavior of the new LOB type fields is to store all the information in row whenever possible. If the information is bigger than the space in the page it will be stored in LOB data pages.
In this test, queries including the LOB field and queries without the LOB field will have the same performance, because they will need to read the same pages.
There is a different configuration to change the behavior of the new LOB types. We can set the attribute ‘Large value types out of row’ for the table, so all the information in the new LOB types will be stored in LOB data pages.
Let’s drop the table, create again with ‘Large value types out of row’ activated and insert the data again:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
drop table LOB2 go create table LOB2 (id int identity(1,1) not null primary key, name varchar(50), lfield varchar(max) ) go exec sp_tableoption 'LOB2' , 'Large Value Types out of row',true insert into LOB2 values('John',replicate('x',900)) go 100 insert into LOB2 values('John',replicate('x',2000)) go 100 |
Checking the page allocation again, we will find only one data page and fifty-four (54) LOB data pages. The tests with the queries with and without the LOB field will show the same: Without the LOB field only the data pages will be read, so the query will perform much better than the one that includes the LOB field.
Conclusion
The new LOB types uses exactly the opposite default storage pattern than the old LOB types. The text and image fields are deprecated, but when you switch to the new varchar(max) and varbinary(max) you will be changing the way the information is stored and the performance of your environment, unless you set the ‘Large value types out of row’ in all your tables.
Load comments