SQL Server 2016 CTP 3.2 is already available, with a lot of new features.
One of them is the ability to compress and decompress fields during DML operations (insert/select/update) using the functions COMPRESS and DECOMPRESS.
One important point to notice is that compressed fields cannot be indexed. On the other hand, if you will not search records by this field, compression can be a good option.
Let’s see an example of how these functions work and the result of the compression.
-- Create the first sample table
create table WithoutCompress (
    id int identity(1,1),
    longfield varchar(max)
)
go
-- Optimize large value storage (see more about this here)
exec sp_tableoption 'WithoutCompress', 'Large Value Types out of row', true
go
-- Insert a thousand big records
insert into WithoutCompress (longfield)
values (replicate('this is a compress example', 1000))
go 1000
-- Check the total pages of the table
select allocation_unit_type_desc, page_type_desc, count(*) as TotalPages
from sys.dm_db_database_page_allocations(DB_ID('test'), OBJECT_ID('WithoutCompress'), 0, 1, 'DETAILED')
group by allocation_unit_type_desc, page_type_desc
order by allocation_unit_type_desc
We have 1,012 TEXT_MIX_PAGE pages, besides the other page types.
-- Create the second table
create table WithCompress (
    id int identity(1,1),
    longfield varbinary(max)
)
go
The field that stores the compressed information needs to be varbinary, because COMPRESS returns varbinary(max).
-- Optimize large value storage
exec sp_tableoption 'WithCompress', 'Large Value Types out of row', true
go
-- Insert a thousand large records
insert into WithCompress (longfield)
values (Compress(replicate('this is a compress example', 1000)))
go 1000
-- Check the total pages of the 2nd table
select allocation_unit_type_desc, page_type_desc, count(*) as TotalPages
from sys.dm_db_database_page_allocations(DB_ID('test'), OBJECT_ID('WithCompress'), 0, 1, 'DETAILED')
group by allocation_unit_type_desc, page_type_desc
order by allocation_unit_type_desc
Now we have only 24 TEXT_MIX_PAGE pages. Of course, the compression ratio depends on the data in the table. If you store data that is already compressed, such as jpg files, the ratio won't be as high and you will only spend CPU in the compress/decompress process.
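This behavior is easy to reproduce outside SQL Server, since COMPRESS uses the standard GZIP format. A minimal Python sketch (using the standard gzip module as a stand-in for the server-side function, and random bytes as a stand-in for already-compressed data like jpg files):

```python
import gzip
import os

# Highly repetitive data, like the replicate() example above, compresses very well.
repetitive = b"this is a compress example" * 1000
compressed = gzip.compress(repetitive)
print(len(repetitive), len(compressed))  # the 26,000 bytes shrink dramatically

# Random bytes behave like already-compressed data (jpg, zip, ...):
# GZIP cannot shrink them, so only CPU is spent.
random_data = os.urandom(26000)
recompressed = gzip.compress(random_data)
print(len(random_data), len(recompressed))  # no reduction, slight overhead
```

The repetitive string compresses to a tiny fraction of its size, which is why the table above dropped from 1,012 pages to 24; the random buffer actually gets slightly larger because of the GZIP header and trailer.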
We can decompress the information when we query the table. This is a sample query:
select top 10 decompress(longfield) as longfield
from WithCompress
The result of the DECOMPRESS function is of type varbinary, so we need to cast the result to see the original information. Let's add the cast to the query:
select top 10 cast(decompress(longfield) as varchar(max)) as longfield
from WithCompress
The compression uses the GZIP algorithm, so we can decompress the information in the client application instead of in the query. The varbinary field is mapped to byte[] by Entity Framework, and we can use GZipStream to decompress the information in .NET.
This is a sample code to decompress the field and display the result in a Windows Forms textbox:
// Read the first record and decompress the field with GZipStream
var rec = (from x in ctx.WithCompresses select x).FirstOrDefault();
MemoryStream ms = new MemoryStream(rec.longfield);
GZipStream gz = new GZipStream(ms, CompressionMode.Decompress);
StreamReader sr = new StreamReader(gz);
textBox2.Text = sr.ReadToEnd();
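Because the format is plain GZIP, clients outside .NET can do the same thing with any GZIP library. A minimal Python sketch of the same round trip (the compressed bytes here are produced locally with gzip.compress to stand in for the varbinary value a database driver would return from WithCompress.longfield):

```python
import gzip

# Stand-in for the varbinary value read from WithCompress.longfield;
# in a real application this would arrive from the database driver as bytes.
longfield = gzip.compress(b"this is a compress example" * 1000)

# Client-side equivalent of DECOMPRESS: plain GZIP decompression,
# followed by decoding the bytes back to the original string.
original = gzip.decompress(longfield).decode("utf-8")
print(original[:26])  # prints "this is a compress example"
```

This mirrors the GZipStream/StreamReader pair in the C# sample: one step inflates the GZIP stream, the other turns the resulting bytes back into text.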