SQL Server 2016 CTP 3.2 is already available, with a lot of new features.
One of them is the ability to compress and decompress fields during DML operations (insert/select/update) using the functions COMPRESS and DECOMPRESS.
One important point to notice is that compressed fields cannot be indexed. On the other hand, if you will not search records by this field, compression can be a good option.
Let’s see an example of how these functions work and the result of the compression.
-- Create the first sample table
create table WithoutCompress (
    id int identity(1,1),
    longfield varchar(max)
)
go
-- Optimize large value storage (see more about this here)
exec sp_tableoption 'WithoutCompress', 'Large Value Types out of row', true
go
-- Insert a thousand big records
insert into WithoutCompress (longfield)
values (replicate('this is a compress example', 1000))
go 1000
-- Check the total pages of the table
select allocation_unit_type_desc, page_type_desc, count(*) as TotalPages
from sys.dm_db_database_page_allocations(DB_ID('test'), OBJECT_ID('WithoutCompress'), 0, 1, 'DETAILED')
group by allocation_unit_type_desc, page_type_desc
order by allocation_unit_type_desc
We have 1,012 TEXT_MIX_PAGE pages, besides the other page types.
-- Create the second table
create table WithCompress (
    id int identity(1,1),
    longfield varbinary(max)
)
go
The field that stores the compressed information needs to be varbinary, because COMPRESS returns varbinary(max).
-- Optimize large value storage
exec sp_tableoption 'WithCompress', 'Large Value Types out of row', true
go
-- Insert a thousand large records
insert into WithCompress (longfield)
values (Compress(replicate('this is a compress example', 1000)))
go 1000
-- Check the total pages of the 2nd table
select allocation_unit_type_desc, page_type_desc, count(*) as TotalPages
from sys.dm_db_database_page_allocations(DB_ID('test'), OBJECT_ID('WithCompress'), 0, 1, 'DETAILED')
group by allocation_unit_type_desc, page_type_desc
order by allocation_unit_type_desc
Now we have only 24 TEXT_MIX_PAGE pages. Of course, the compression ratio depends on the data in the table. If you store data that is already compressed, such as jpg files, the ratio won't be as high and you will only spend CPU in the compress/decompress process.
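This behavior is easy to reproduce outside SQL Server, since COMPRESS uses the standard GZIP format. A minimal Python sketch (using the standard gzip module as a stand-in for the server-side function, and random bytes as a stand-in for already-compressed data like jpg files):

```python
import gzip
import os

# Highly repetitive data, like the replicate() example above, compresses very well.
repetitive = b"this is a compress example" * 1000
compressed = gzip.compress(repetitive)
print(len(repetitive), len(compressed))  # the 26,000 bytes shrink dramatically

# Random bytes behave like already-compressed data (jpg, zip, ...):
# GZIP cannot shrink them, so only CPU is spent.
random_data = os.urandom(26000)
recompressed = gzip.compress(random_data)
print(len(random_data), len(recompressed))  # no reduction, slight overhead
```

The repetitive string compresses to a tiny fraction of its size, which is why the table above dropped from 1,012 pages to 24; the random buffer actually gets slightly larger because of the GZIP header and trailer.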
We can decompress the information when we query the table. This is a sample query:
select top 10 decompress(longfield) as longfield
from WithCompress
The result of the DECOMPRESS function is of type varbinary, so we need to cast the result to see the original information. Let's add the cast to the query:
select top 10 cast(decompress(longfield) as varchar(max)) as longfield
from WithCompress
The compression uses the GZIP algorithm, so we can decompress the information in the client application instead of in the query. The varbinary field is mapped to byte[] by Entity Framework, and we can use GZipStream to decompress the information in .NET.
This is a sample code to decompress the field and display the result in a Windows Forms textbox:
// Read the first record and decompress the field with GZipStream
var rec = (from x in ctx.WithCompresses select x).FirstOrDefault();
MemoryStream ms = new MemoryStream(rec.longfield);
GZipStream gz = new GZipStream(ms, CompressionMode.Decompress);
StreamReader sr = new StreamReader(gz);
textBox2.Text = sr.ReadToEnd();
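Because the format is plain GZIP, clients outside .NET can do the same thing with any GZIP library. A minimal Python sketch of the same round trip (the compressed bytes here are produced locally with gzip.compress to stand in for the varbinary value a database driver would return from WithCompress.longfield):

```python
import gzip

# Stand-in for the varbinary value read from WithCompress.longfield;
# in a real application this would arrive from the database driver as bytes.
longfield = gzip.compress(b"this is a compress example" * 1000)

# Client-side equivalent of DECOMPRESS: plain GZIP decompression,
# followed by decoding the bytes back to the original string.
original = gzip.decompress(longfield).decode("utf-8")
print(original[:26])  # prints "this is a compress example"
```

This mirrors the GZipStream/StreamReader pair in the C# sample: one step inflates the GZIP stream, the other turns the resulting bytes back into text.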