Columnstore Queries in SQL Server 2012

On demo, the columnstore index of SQL Server 2012 gives dazzling performance, but it is optimised for data warehouse queries so it is by no means a universal route to high-performance queries. Once you understand the context in which they are best used, and the ways of ensuring that they work as intended, they can be extremely useful.

A Columnstore index behaves much like any other index, but with certain restrictions. Under the covers, there are profound differences, mainly in the way that the data is stored. In a Columnstore Index the data is indexed column-wise, while normally it is row-wise. At one time, it was considered that it was next to impossible to leverage Columnstore technology in a database for anything other than specialist applications, but the new Columnstore index in SQL Server has proved to be more widely valuable for accessing any data that is immutable.

1831-RowStore.PNG

1831-ColumnStore.PNG

The Columnstore index stores each column in a separate set of disk pages. In a normal index, one page could contain multiple rows.

As you can see in this image, many rows of data are stored in one page in a B-tree or a heap in a conventional index.

 In a Columnstore, by contrast, a page will contain data from just one column (Multiple records). If a query needs to fetch data from 15% of the total columns that exists in a table, It can fetch data for just those columns and so does not need to read as much data as with a conventional index.

 In addition, the columnstore reduces the storage requirements by compressing the data in the Columnstore index. This reduces the I/O that is required to get the index data.

 In a B-Tree, it has to fetch the whole row in order to get the columns required, which results in a higher I/O.

Where is it used?

In general, the Columnstore is more suited for use with a Data warehouse than in OLTP databases. The performance of queries in Data Warehouses can be improved by a hundredfold in almost all cases by using a Columnstore Index and the enhanced query processing engine. The Columnstore index is designed to work well when scanning and aggregating data in large fact tables in star join queries. However, there are plenty of cases where a Columnstore index does not perform as well as a B-Tree index.

  1. A Columnstore index does not perform well when running an OLTP style query that does point lookups, or where every column is fetched from a wide row. Multi-column joins and fact-to-fact table joins are not performed well when using a Columnstore index.
  2. If the predicates contain OR statements, the database engine generates a less efficient plan.
  3. A Columnstore index is most efficient when using numeric data types when filtering.
  4. An inefficient plan is produced for a UNION ALL.
  5. The  Columnstore index is inefficient when the query optimiser’s batch processing mode cannot be used and there are a large number of JOINs.
  6. The use of IF EXISTS, IN, NOT IN or OUTER JOIN will give bad performance.

There are some tricks to work around these issues which we’ll discuss later in the article

The ETL time has been reduced by the use of the ColumnStore index because time is not lost in performing pre-built summary aggregates. One of the disadvantages of using pre-built aggregates was that it is difficult to design and maintain them. When you have a Columnstore index, you do not have to worry too much if the query is changed.

Your BI end-users will be encouraged to work more with the data since queries take so much less time when there is a Columnstore index.

Why is it fast?

As stated above, Columnstore reduces the inherent I/O bottleneck of SQL Server by storing each column index in a separate set of pages. In addition: 

  1. It is easier to compress data since the data in columns generally has more duplication than a row
  2. When executing a query, it needs to fetch only the data for the columns that are needed in the result set or the predicate. Since the columns are in separate sets of pages, it can fetch just the data for the column that is needed and does not need to go through the whole record set each time.
  3. The buffer hit-ratio is improved because data is highly compressed and all frequently-accessed column data will be kept in memory.

Another reason why the query execution time is greatly reduced is that the engine uses a new technology called Apollo. Apollo is the code name for the new functionality that is now available in SQL 2012 that introduces a vector-based query execution that can process data in batches.

There is also another reason why the Columnstore index is so effective. When the index is created, it breaks it down into one million row chunks of data. These chunks are called segments and each segment will have the minimum and maximum value of each column. Therefore the engine can check the filter conditions against this metadata and skip large chunks of data to speed up the scans.

So what is Batch mode processing?

Have you heard of RBAR? One of the main problems when writing a query was that we sometimes use a Row-by-Row method to do certain actions. At one time SQL Server was only capable of this intensive type of operation. With batch mode, the processing is done in Batches. A batch is typically a thousand rows of data. Batch mode processing uses a new algorithm for enhanced parallelism that is optimized for MultiCore CPUs and memory found in next generation hardware. Batch mode splits the workload by spreading the cost of overheads  such as accessing the metadata over all the rows in a batch in one go rather than paying the cost for one row at a time. Batch mode eliminates some of the exchange operators used by Row-by-Row method and it also operates on compressed data. This results in much better performance and it does parallelism much efficiently.

The batch mode processing can only be used where there is a Columnstore. These two technologies are closely linked. If the the optimizer thinks that there is no need for Batch execution mode even though a Columnstore index being used, then it will use Row-by-Row processing. We can find out if a query is using batch mode or Row-by-Row mode by looking at the execution plan. In the execution plan you will see a new operator with its own new icon called batchHashTableBuild. You will also see a new property in the execution plan. The values for this property can either be ROW or BATCH. The new property in the actual execution plan is known as ActualExecutionMode and in the estimated execution plan it is known as EstimatedExecutionPlan. As with any plans, the actual and execution plan can differ. When the plan is formulated, the optimizer chooses whether it should do a batch or a row processing. Let us say that the plan was formulated with Batch. At the time of execution, this could change from a batch plan  to a row plan if, for example, there are not enough threads or memory at the time of execution.

If you find that the execution starts running in parallel mode but, later on, changes to row mode processing (Serial) then you’ll suspect that the there was a problem with memory. This will be confirmed if you see an xevent that is fired if  there is not enough memory when doing a hash join.

Creating a Columnstore index

The syntax for creating a ColumnStore index is almost the same as when creating any other normal index. It takes three times longer to create a Columnstore index than a clustered index on the same data. Here is an easy CREATE INDEX statement for a Columnstore Index.

Instead of using T-SQL directly, you can also use SSMS to create the Columnstore index. One thing to keep in mind is that, when using the SSMS GUI, the index-creation might fail due to timeout. For this reason, it is better to use T-SQL when creating a Columnstore index. Imagine creating a clustered index on a table that has billions of rows and then multiply the time taken by three. That will give a rough estimate of how much time it can take to create a Columnstore index.

When creating a Columnstore index, three things happen behind the scenes. Row groups are created with one millions rows of data in one group and this is partitioned horizontally. The columns are then partitioned vertically into segments. Finally, last but not least, there is process of compressing the data.

 There are some things to note when creating a Columnstore index.

  1. You can give the column names in any order when creating it, without affecting the index.
  2. You can create a Columnstore index with some or all of the columns of the table.
  3. The index cannot have more than 1024 columns
  4. Not all data types can be used in a Columnstore index.
  5. The Columnstore index has to be non-clustered. (I heard that they are planning to have a clustered Columnstore index soon).
  6. The Columnstore Index cannot be a primary key, a foreign key or a unique constraint.
  7. There can only be one Columnstore index on any table
  8. A columnstore index can’t be created on a view
  9. A Columnstore index cannot include a computed, sparse or FILESTREAM column.
  10. The Columnstore index cannot be a filtered index. It has to have all rows in the table defined in the index.
  11. The Columnstore index is usually created using parallel processing. Please keep in mind that the creation of index will use parallel processing only if there are more than a million rows in the table.
  12. The Columnstore index does not use or keep statistics. The query optimiser uses the statistics of the heap or other indexes when trying to create a plan.

Maximizing The Performance Gain From A Columnstore When Writing Queries

Even though there are plenty of performance limitations to the Columnstore index, there are ways to get around them.

One of the main reasons why Columnstore index is efficient is that the queries that use them are amenable to batch execution mode. We must therefore be sure that the value of MAXDOP has not been set to one on the instance level because batch execution modes uses parallelism.

In the beginning of this article there was a section that stated that if we use UNION ALL, the performance of the query is not good. The main reason behind it is that when using UNION ALL, the engine does not execute it in Batch mode. It uses a Row-by-Row mode. However, one can get around this restriction by using CTEs for each SELECT statement that is part of the UNION ALL and then combine the results, or alternatively you can write sub queries for each table that is in the UNION ALL and combine the result set.

Another way to get bad performance when using Columnstore index is to use an IN or EXIST clause because the engine sometimes does not use Batch execution mode. If you can avoid the use of IN or EXIST by using   JOIN, the engine will use Batch for processing the data and the query will run faster

When writing queries, we have to encourage the database engine to use Batch execution mode to speed up the query significantly. We can do that by using CTEs, Subqueries and inner JOINS.

Do index hints work?

There are cases when the query optimizer incorrectly chooses a B-tree or a heap in preference to the columnstore. In these circumstances, we can force the optimizer to use the columnstore index by using the index hint just as we do for a B-Tree index.

There are other cases where the query optimizer wrongly chooses the columnstore. In these cases, we can prevent this by using the keyword IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX. The syntax for specifying this is as given below.

One of the main disadvantages in using the columnstore index is that we cannot do any updates on that table. In a warehouse this is not a big issue since the data is loaded nightly. There are usually no data changes done other than the nightly loads. In these cases, we can just drop the columnstore index and recreate it after the data has been loaded or updated. Another way to insert data is by using partition switching.

Conclusion

The introduction of the columnstore index and the improvements in the query processing engine with breakthrough technologies has given a big performance boost for data warehouse query processing. We have seen the overall view of the columnstore index and the improved query processing engine.

References