{"id":96516,"date":"2023-04-07T13:26:34","date_gmt":"2023-04-07T13:26:34","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=96516"},"modified":"2026-03-09T14:50:19","modified_gmt":"2026-03-09T14:50:19","slug":"ordered-columnstore-indexes-in-sql-server-2022","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/ordered-columnstore-indexes-in-sql-server-2022\/","title":{"rendered":"Ordered Columnstore Indexes in SQL Server 2022: Usage &#038; Limits"},"content":{"rendered":"\n<p>SQL Server 2022 introduced the ORDER clause for clustered columnstore indexes, allowing you to specify a sort column when creating or rebuilding the index: CREATE CLUSTERED COLUMNSTORE INDEX CCI ON Table ORDER (DateColumn). When data is ordered, rowgroup min\/max ranges become tight and non-overlapping, enabling effective rowgroup elimination during queries.<\/p>\n\n\n\n<p>Before SQL Server 2022, achieving this required manually ordering data through a clustered B-tree index before converting to columnstore. The ORDER clause automates this, but with limitations: it only applies during CREATE INDEX and ALTER INDEX REBUILD operations, not during ongoing INSERT operations. New data inserted after the index is built may not maintain order. This article demonstrates the feature, measures its performance impact, and evaluates its real-world limitations.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-introduction\">Introduction<\/h2>\n\n\n\n<p>One of the more challenging technical details of columnstore indexes that regularly gets attention is the need for data to be ordered to allow for segment elimination. In a non-clustered columnstore index, data order is automatically applied based on the order of the underlying rowstore data. In a clustered columnstore index, though, data order is not enforced by any SQL Server process. This leaves managing data order to us, which may or may not be an easy task.<\/p>\n\n\n\n<p>To assist with this challenge, SQL Server 2022 has added the ability to specify an <code>ORDER<\/code> clause when creating or rebuilding an index. This feature allows data to be automatically sorted by SQL Server as part of those insert or rebuild processes. This article dives into this feature, exploring both its usage and its limitations.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-speedy-review-of-data-order\">Speedy Review of Data Order<\/h2>\n\n\n\n<p>To fully appreciate the impact of data order on a columnstore index, a quick demo will be presented to show its impact on segment elimination and the resulting effect on performance.<\/p>\n\n\n\n<p>The following T-SQL creates a table with a columnstore index and inserts ~7.1 million rows into it (this script uses the <code>WideWorldImportersDW<\/code> database. You can download this database here):<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">CREATE TABLE dbo.fact_order_BIG_CCI (\n     [Order Key] [bigint] NOT NULL,\n     [City Key] [int] NOT NULL,\n     [Customer Key] [int] NOT NULL,\n     [Stock Item Key] [int] NOT NULL,\n     [Order Date Key] [date] NOT NULL,\n     [Picked Date Key] [date] NULL,\n     [Salesperson Key] [int] NOT NULL,\n     [Picker Key] [int] NULL,\n     [WWI Order ID] [int] NOT NULL,\n     [WWI Backorder ID] [int] NULL,\n     [Description] [nvarchar](100) NOT NULL,\n     [Package] [nvarchar](50) NOT NULL,\n     [Quantity] [int] NOT NULL,\n     [Unit Price] [decimal](18, 2) NOT NULL,\n     [Tax Rate] [decimal](18, 3) NOT NULL,\n     [Total Excluding Tax] [decimal](18, 2) NOT NULL,\n     [Tax Amount] [decimal](18, 2) NOT NULL,\n     [Total Including Tax] [decimal](18, 2) NOT NULL,\n     [Lineage Key] [int] NOT NULL);\n\n-- Generate 7,173,772 rows in a heap\nINSERT INTO dbo.fact_order_BIG_CCI\nSELECT\n     [Order Key] + (250000 * ([Day Number] + \n             ([Calendar Month Number] * 31))) AS [Order Key]\n    ,[City Key]\n    ,[Customer Key]\n    ,[Stock Item Key]\n    ,[Order Date Key]\n    ,[Picked Date Key]\n    ,[Salesperson Key]\n    ,[Picker Key]\n    ,[WWI Order ID]\n    ,[WWI Backorder ID]\n    ,[Description]\n    ,[Package]\n    ,[Quantity]\n    ,[Unit Price]\n    ,[Tax Rate]\n    ,[Total Excluding Tax]\n    ,[Tax Amount]\n    ,[Total Including Tax]\n    ,[Lineage Key]\nFROM Fact.[Order]\nCROSS JOIN\nDimension.Date\nWHERE Date.Date &lt;= '2013-01-31';\n\n-- Create a columnstore index on the table.\nCREATE CLUSTERED COLUMNSTORE INDEX CCI_fact_order_BIG_CCI \n                                         ON dbo.fact_order_BIG_CCI;<\/pre>\n\n\n\n<p>No optimizations are made or consideration taken regarding data order. It\u2019s a columnstore index sitting on top of a pile of data as-is. An assumption will be made here that this data will be searched most frequently based on date. If so, the ideal data order would be by the <code>Order Date Key<\/code> column. To verify the order of rows with respect to <code>Order Date Key<\/code>, the following query can be used to show the minimum and maximum values for that column within each rowgroup:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">SELECT\n     tables.name AS table_name,\n     indexes.name AS index_name,\n     columns.name AS column_name,\n     partitions.partition_number,\n     column_store_segments.segment_id,\n     column_store_segments.min_data_id,\n     column_store_segments.max_data_id,\n     column_store_segments.row_count\nFROM sys.column_store_segments\nINNER JOIN sys.partitions\nON column_store_segments.hobt_id = partitions.hobt_id\nINNER JOIN sys.indexes\nON indexes.index_id = partitions.index_id\nAND indexes.object_id = partitions.object_id\nINNER JOIN sys.tables\nON tables.object_id = indexes.object_id\nINNER JOIN sys.columns\nON tables.object_id = columns.object_id\nAND column_store_segments.column_id = columns.column_id\nWHERE tables.name = 'fact_order_BIG_CCI'\nAND columns.name = 'Order Date Key'\nORDER BY tables.name, columns.name, column_store_segments.segment_id;<\/pre>\n\n\n\n<p>The results are as follows (Your results should be similar, but likely differ a little bit, this is normal. In some executions I had an additional partition created):<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"740\" height=\"183\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2023\/03\/word-image-96516-1.jpeg\" alt=\"\" class=\"wp-image-96517\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>Each row is a single rowgroup within the columnstore index. Note that for each rowgroup, the <code>min_data_id<\/code> and <code>max_data_id<\/code> are the same. This indicates that the data is unordered, and the same set of disparate values is scattered across all rowgroups. The effect of this can be illustrated by running a simple analytic query.:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">SET STATISTICS IO ON; --Output the IO that was needed to execute the query\n\nSELECT SUM([Total Excluding Tax]) AS [Total Excluding Tax]\nFROM dbo.fact_order_BIG_CCI\nWHERE [Order Date Key] = '2014-12-04';<\/pre>\n\n\n\n<p>This query returns a sum for only a single date value. When executed, the output of statistics IO shows the following:<\/p>\n\n\n\n<p><code>Table 'fact_order_BIG_CCI'. Scan count 1, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 1540, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.<\/code><\/p>\n\n\n\n<p><code>Table 'fact_order_BIG_CCI'. Segment reads 8, segment skipped 0.<\/code><\/p>\n\n\n\n<p>Note that 8 segments were read and 0 were skipped. This indicates that data had to be read from all rowgroups in order to return information for a single date. The reads (1547) are suspiciously high for such a simple query as well.<\/p>\n\n\n\n<p><strong>Read also:<\/strong> <a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/hands-on-with-columnstore-indexes-part-2-best-practices-and-guidelines\/\" target=\"_blank\" rel=\"noreferrer noopener\">Columnstore best practices: data ordering (Part 2)<\/a><\/p>\n\n\n\n<p>The next step for this demonstration is to order this data and rerun the above queries again.<\/p>\n\n\n\n<p>To do this, the columnstore index will be swapped out for a clustered rowstore index that is ordered by <code>Order Date Key<\/code>. The clustered rowstore index will then be swapped for a clustered columnstore index. <code>MAXDOP<\/code> of 1 ensures that parallelism does not inadvertently result in unordered data via multiple streams of ordered data being shuffled together into new rowgroups: (The clustered index I create first orders the data in the object before we create the clustered columnstore index.)<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">CREATE CLUSTERED INDEX CCI_fact_order_BIG_CCI \n       ON dbo.fact_order_BIG_CCI ([Order Date Key]) WITH (DROP_EXISTING = ON);\n\nCREATE CLUSTERED COLUMNSTORE INDEX CCI_fact_order_BIG_CCI \n       ON dbo.fact_order_BIG_CCI WITH (DROP_EXISTING = ON, MAXDOP = 1);<\/pre>\n\n\n\n<p>Rerunning the rowgroup metadata query from earlier returns the following:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"749\" height=\"151\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2023\/03\/word-image-96516-2.jpeg\" alt=\"\" class=\"wp-image-96518\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>The minimum and maximum values for Order Date key are now nicely ordered from smallest to largest values. Running the simple analytic query from earlier produces the following statistics IO output:<\/p>\n\n\n\n<p><code>Table 'fact_order_BIG_CCI'. Scan count 1, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 388, lob physical reads 0, lob page server reads 0, lob read-ahead reads 836, lob page server read-ahead reads 0.<\/code><\/p>\n\n\n\n<p><code>Table 'fact_order_BIG_CCI'. Segment reads 1, segment skipped 6.<\/code><\/p>\n\n\n\n<p>Only 1 segment was read whereas 6 were skipped. Note that the logical IO is significantly less than before, by about a factor of 10!<\/p>\n\n\n\n<p>Filtering a large OLAP data set on ordered columnstore data will be significantly faster and more efficient than unordered data. But how is data ordered normally over the life span of a table? The most common solution is to:<\/p>\n\n\n<div class=\"block-core-list\">\n<ol class=\"wp-block-list\">\n<li>Structure an analytic table to only accept insert and delete operations<\/li>\n\n\n\n<li>Avoid updates at all costs.<\/li>\n\n\n\n<li>Insert new data at the end of the table for the most recent date dimension.<\/li>\n<\/ol>\n<\/div>\n\n\n<p>Alternatively, if a large table needs to be converted to a columnstore index, then trickery as demonstrated above is necessary, ordering with a rowstore index and then swapping in a columnstore index on top of the newly ordered table.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-introducing-ordered-columnstore-indexes\">Introducing: Ordered Columnstore Indexes!<\/h2>\n\n\n\n<p>SQL Server 2022 adds the <code>ORDER<\/code> clause to the clustered columnstore index creation syntax. This column list specifies which columns to sort columnstore data by as part of the rebuild operation or as part of an <code>INSERT<\/code> operation.<\/p>\n\n\n\n<p>To alter our existing columnstore index to use this feature, the following syntax can be used:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">CREATE CLUSTERED COLUMNSTORE INDEX CCI_fact_order_BIG_CCI ON dbo.fact_order_BIG_CCI\nORDER ([Order Date Key])\n WITH (DROP_EXISTING = ON, MAXDOP = 1);<\/pre>\n\n\n\n<p>The result of this change formalizes the order of the columnstore index to default to using <code>Order Date Key<\/code>. When the <code>ORDER<\/code> keyword is included in a columnstore index create statement, SQL Server will sort the data in <code>TempDB<\/code> based on the column(s) specified. In addition, when new data is inserted into the columnstore index, it will be pre-sorted as well. When an ordered clustered columnstore index is the target of an index rebuild operation, the column order specified earlier will be honored.<\/p>\n\n\n\n<p>This feature sounds like the perfect solution to enable segment elimination, but it is in no way free. As implied by the use of the word <code>SORT<\/code>, SQL Server needs to expend resources to sort the data. This not only requires <code>TempDB<\/code> resources, but it is an <code>OFFLINE<\/code> operation. Therefore, swapping a clustered columnstore index for one that is sorted will result in rebuild operations going from being online to offline operations. Note that in a partitioned table, then operation will only be offline for partitions affected by the rebuild operation. Typically, this will be a single current-data partition. Therefore, partitioning can help in improving availability for much of the data in ordered clustered columnstore indexes.<\/p>\n\n\n\n<p>The sort operation can be visualized by checking the execution plan for both a standard and ordered columnstore index.<\/p>\n\n\n\n<p>If you look at the query plan for creating thee indexes, you will see some interesting differences. The following plan is for creating the non-ordered columnstore index built earlier in this article:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"805\" height=\"322\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2023\/03\/word-image-96516-3.jpeg\" alt=\"\" class=\"wp-image-96519\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>No surprises here. I wait 30 seconds and a columnstore index is ready for use on the table. Using the new syntax above that includes the <code>ORDER<\/code> clause will produce the following execution plan:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"964\" height=\"313\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2023\/03\/word-image-96516-4.jpeg\" alt=\"\" class=\"wp-image-96520\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>A <code>SORT<\/code> operation appears in the plan. Without partitioning or any other aids available, SQL Server needs to sort the entire data set in order to create an index that is ordered on the column of my choosing.<\/p>\n\n\n\n<p>The column(s) used in the <code>ORDER<\/code> clause can be validated anytime in the system view <code>sys.index_columns<\/code>, using a query similar to this:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">SELECT\n     tables.name AS table_name,\n     indexes.name AS index_name,\n     columns.name AS column_name,\n     index_columns.column_store_order_ordinal\nFROM sys.index_columns\nINNER JOIN sys.indexes\nON indexes.index_id = index_columns.index_id\nAND indexes.object_id = index_columns.object_id\nINNER JOIN sys.columns\nON index_columns.object_id = columns.object_id\nAND columns.column_id = index_columns.column_id\nINNER JOIN sys.tables\nON tables.object_id = indexes.object_id\nWHERE tables.name = 'fact_order_BIG_CCI';<\/pre>\n\n\n\n<p>The results list all of the columns in the table, along with their column store order, if applicable. A value of zero indicates that the column is not a part of the columnstore order:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"539\" height=\"213\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2023\/03\/word-image-96516-5.jpeg\" alt=\"\" class=\"wp-image-96521\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>The metadata shows that <code>Order Date Key<\/code> is the sole <code>ORDER<\/code> column for this columnstore index. Consider a new index ordered by <code>Order Date Key<\/code> and then by <code>Order Key<\/code>:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">CREATE CLUSTERED COLUMNSTORE INDEX CCI_fact_order_BIG_CCI ON dbo.fact_order_BIG_CCI\nORDER ([Order Date Key], [Order Key])\nWITH (DROP_EXISTING = ON, MAXDOP = 1);<\/pre>\n\n\n\n<p>For this index, the metadata results from above show the additional column as part of the columnstore order:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"533\" height=\"218\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2023\/03\/word-image-96516-6.jpeg\" alt=\"\" class=\"wp-image-96522\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>Note that <code>Order Key<\/code> is now shows as the second column in the columnstore order, after <code>Order Date Key<\/code>. For a large columnstore indexed table, ordering by an additional column may further assist queries that rely on multi-column filters.<\/p>\n\n\n\n<p><strong>Read also:<\/strong><br><a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/what-are-columnstore-indexes\/\" target=\"_blank\" rel=\"noreferrer noopener\">What Are Columnstore Indexes?<\/a><br><a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/hands-on-with-columnstore-indexes-part-1-architecture\/\" target=\"_blank\" rel=\"noreferrer noopener\">Columnstore index architecture (Part 1)<\/a><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-ordered-columnstore-index-notes-and-considerations\">Ordered Columnstore Index Notes and Considerations<\/h2>\n\n\n\n<p>It is worth repeating that only clustered columnstore indexes can be use the <code>ORDER<\/code> clause. Non-clustered columnstore indexes inherit data order from the underlying rowstore index structure during creation and will provide segment elimination as effectively as the underlying data allows.<\/p>\n\n\n\n<p>Ordered columnstore indexes are a heavy-handed and somewhat imperfect way of improving data order and segment elimination for three reasons:<\/p>\n\n\n<div class=\"block-core-list\">\n<ol class=\"wp-block-list\">\n<li>Rebuilding an ordered columnstore index is an <code>OFFLINE<\/code> operation.<\/li>\n\n\n\n<li>Any data that is sorted must be sorted in <code>TempDB<\/code>.<\/li>\n\n\n\n<li>If a sort operation is so large that it would spill to disk, then the sort is aborted and data order may not be perfect.<\/li>\n<\/ol>\n<\/div>\n\n\n<p>There is value in exploring each of these issues to ensure that decisions made regarding ordered columnstore indexes are as optimal as possible:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-rebuilds-on-an-ordered-columnstore-index-offline-operation-per-partition\">Rebuilds on an Ordered Columnstore Index = OFFLINE Operation (Per Partition)<\/h3>\n\n\n\n<p>This is an availability concern that will not impact all users of columnstore indexes. If dedicated maintenance windows exist for index maintenance and <code>OFFLINE<\/code> is acceptable for rebuild operations, then there is no issue to be had here. Index reorganization is still an <code>ONLINE<\/code> operation and can be used as part of regular maintenance to reduce the need for rebuilds.<\/p>\n\n\n\n<p>The amount of time that a given partition is unavailable will be equal to the amount of time needed to sort and rebuild the data in that partition. Therefore, if rebuilding a partition with 20 million rows in a clustered columnstore index typically takes 30 seconds, then the same operation would result in at least 30 seconds of offline\/downtime for the same index if it were ordered. Presumably the sort operation will require some additional time, increasing the overall time needed to some number greater than 30 seconds.<\/p>\n\n\n\n<p>It is possible to use partition switching to increase availability, though whether that level of added complexity is desired or not is wholly up to you to decide.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-ordered-columnstore-indexes-sort-in-tempdb\">Ordered Columnstore Indexes Sort in TempDB<\/h3>\n\n\n\n<p>Data needs to be sorted, and in SQL Server, <code>TempDB<\/code> is where this happens. As a result, to ensure that data is properly sorted, it is important that <code>TempDB<\/code> has enough space allocated for it. If 500MB of data is to be inserted into an ordered columnstore index, then there needs to be an additional 500MB of <code>TempDB<\/code> space available.<\/p>\n\n\n\n<p>Similarly, if a 500GB columnstore index is being rebuilt into an ordered columnstore index, then 500GB+ of <code>TempDB<\/code> space will be required. Depending on the organization and its hardware,<\/p>\n\n\n\n<p>If there is insufficient <code>TempDB<\/code> space available to store the incoming columnstore data, then don\u2019t use this feature. Ensuring there is enough space, though, becomes more important due to the next challenge:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-sort-operations-do-not-complete-in-their-entirety\">Sort Operations Do Not Complete in Their Entirety<\/h3>\n\n\n\n<p>What happens when an ordered columnstore index is rebuilt or inserted into and not enough space is available in <code>TempDB<\/code>? Not what is expected! Typically, if a sort operator in an execution plan requires more <code>TempDB<\/code> space than is allocated or available, it spills to disk to allocate enough space to complete the operation.<\/p>\n\n\n\n<p>That is NOT what happens with ordered columnstore indexes. If the data to be sorted into an ordered columnstore index exceeds the space allocated by the memory grant, then the sort operation simply completes and is flagged internally as a <strong><em>soft sort<\/em><\/strong>. A soft sort will sort as many rows as it can, but when space runs out, it will stop. While this behavior ensures that write operations are fast and do not spill to disk, they also pose a hazard as it is not immediately clear that this behavior occurred without further research.<\/p>\n\n\n\n<p>To illustrate a soft sort, the test table earlier will be dropped, recreated, and populated with ~27.8 million rows. <code>TempDB<\/code> will also be reduced to a relatively small size to ensure that there is not enough space available to fully sort this data. The following code handles the table drop, creation, and population:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">DROP TABLE dbo.fact_order_BIG_CCI;\n\nCREATE TABLE dbo.fact_order_BIG_CCI (\n     [Order Key] [bigint] NOT NULL,\n     [City Key] [int] NOT NULL,\n     [Customer Key] [int] NOT NULL,\n     [Stock Item Key] [int] NOT NULL,\n     [Order Date Key] [date] NOT NULL,\n     [Picked Date Key] [date] NULL,\n     [Salesperson Key] [int] NOT NULL,\n     [Picker Key] [int] NULL,\n     [WWI Order ID] [int] NOT NULL,\n     [WWI Backorder ID] [int] NULL,\n     [Description] [nvarchar](100) NOT NULL,\n     [Package] [nvarchar](50) NOT NULL,\n     [Quantity] [int] NOT NULL,\n     [Unit Price] [decimal](18, 2) NOT NULL,\n     [Tax Rate] [decimal](18, 3) NOT NULL,\n     [Total Excluding Tax] [decimal](18, 2) NOT NULL,\n     [Tax Amount] [decimal](18, 2) NOT NULL,\n     [Total Including Tax] [decimal](18, 2) NOT NULL,\n     [Lineage Key] [int] NOT NULL);\n\n-- Generate 27769440 rows in a heap:\nINSERT INTO dbo.fact_order_BIG_CCI\nSELECT\n     [Order Key] + (250000 * ([Day Number] +\n            ([Calendar Month Number] * 31))) AS [Order Key]\n    ,[City Key]\n    ,[Customer Key]\n    ,[Stock Item Key]\n    ,[Order Date Key]\n    ,[Picked Date Key]\n    ,[Salesperson Key]\n    ,[Picker Key]\n    ,[WWI Order ID]\n    ,[WWI Backorder ID]\n    ,[Description]\n    ,[Package]\n    ,[Quantity]\n    ,[Unit Price]\n    ,[Tax Rate]\n    ,[Total Excluding Tax]\n    ,[Tax Amount]\n    ,[Total Including Tax]\n    ,[Lineage Key]\nFROM Fact.[Order]\nCROSS JOIN\nDimension.Date\nWHERE Date.Date &lt;= '2013-04-30'; \t<\/pre>\n\n\n\n<p>With a heap of data waiting to be sorted, trace flag 8666 will be enabled:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">DBCC TRACEON (8666);<\/pre>\n\n\n\n<p>This undocumented trace flag will provide additional query optimizer details that otherwise are not provided in execution plans (graphical or XML). <em>If you experiment with this trace flag, be sure to do so in a test environment and disable it when your testing is complete! <\/em><\/p>\n\n\n\n<p>With added optimizer detail available, let\u2019s turn on the actual execution plan, so we can quickly visualize what is going on here:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"457\" height=\"92\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2023\/03\/word-image-96516-7.jpeg\" alt=\"\" class=\"wp-image-96523\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>Finally, it is time to build an ordered columnstore index on the larger table created above:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">CREATE CLUSTERED COLUMNSTORE INDEX CCI_fact_order_BIG_CCI ON dbo.fact_order_BIG_CCI\nORDER ([Order Date Key], [Order Key])\nWITH (MAXDOP = 1)<\/pre>\n\n\n\n<p>This takes a few minutes to complete on my local server. The execution plan now contains some added information that can be used to understand what happened when SQL Server sorted the incoming data:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"881\" height=\"318\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2023\/03\/word-image-96516-8.jpeg\" alt=\"\" class=\"wp-image-96524\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>Opening the properties for the Sort operator allows us to get some additional detail:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"253\" height=\"662\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2023\/03\/word-image-96516-9.jpeg\" alt=\"\" class=\"wp-image-96525\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>Within the internal debugger info is a flag that indicates if this was a soft sort or not. In this case, the answer is yes!<\/p>\n\n\n\n<p>With that validated, the metadata query from earlier in this article will be run to show how well-ordered the rowgroups in the columnstore index are:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"741\" height=\"212\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2023\/03\/word-image-96516-10.jpeg\" alt=\"\" class=\"wp-image-96526\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>There are 27 rowgroups in total, but it is clear from the first 9 that this data is not actually ordered on Order Date Key effectively. While examining the values for <code>min_data_id<\/code> and <code>max_data_id<\/code> shows SOME ordering, it is clear that a query that should read 1-2 rowgroups will instead read many more as so many value ranges overlap.<\/p>\n\n\n\n<p>To summarize: soft sorting is a hazard for building and maintaining ordered columnstore indexes and may be a reason to hold off on using them until the feature is further refined.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-conclusion\">Conclusion<\/h2>\n\n\n\n<p>Prior to this feature, order was maintained manually. This sounds painful, but in reality, analytic data is often created based on date\/time and naturally orders itself based on some key dimension(s), often date\/time. For example, a table that loads new sales data daily will add a new day\u2019s worth of data to the end of the columnstore index each day. So if updates are not performed on this table, then the data will naturally remain ordered as new data will also be the most recent. Even if 5% of the incoming data were old and out-of-order, overall segment elimination would still be relatively good.<\/p>\n\n\n\n<p>Ordered columnstore indexes are an attempt to bring more focus on rowgroup elimination and its importance in speeding up analytic queries. If a columnstore index has 1 billion rows, then it will contain approximately 1,000 rowgroups. Reading 1000 segments to get data on a single column is far more expensive than reading 3 segments. Despite its promises, ordered columnstore indexes are not perfect and have limitations that make them a less-than-ideal solution, especially for tables where large amounts of data are inserted or rebuilt regularly.<\/p>\n\n\n\n<p>This feature may work for you but be sure to test and ensure that insert operations and rebuilds do indeed sort the data correctly. If not, it may be worth reverting to the old-fashioned way of managing this data using natural ordering via manual processes.<\/p>\n\n\n\n<p>Let\u2019s hope for feature improvements! Columnstore indexes are a prime example of a feature that has improved steadily with every version of SQL Server since its inception. In the same way that columnstore indexes have matured to allow a wider variety of write operations, rowgroup elimination for more data types and predicates, and improved rowgroup consolidation\/maintenance, it is likely that this feature will also get a much-needed upgrade soon.<\/p>\n\n\n\n<p><strong>Read also:<\/strong> <a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/ai-in-sql-server-2025-embeddings\/\" target=\"_blank\" rel=\"noreferrer noopener\">AI embeddings in SQL Server 2025<\/a><\/p>\n\n\n\n<section id=\"my-first-block-block_320c21ffd9c98f1fc54ee2a7ec5446d2\" class=\"my-first-block alignwide\">\n    <div class=\"bg-brand-600 text-base-white py-5xl px-4xl rounded-sm bg-gradient-to-r from-brand-600 to-brand-500 red\">\n        <div class=\"gap-4xl items-start md:items-center flex flex-col md:flex-row justify-between\">\n            <div class=\"flex-1 col-span-10 lg:col-span-7\">\n                <h3 class=\"mt-0 font-display mb-2 text-display-sm\">Fast, reliable and consistent SQL Server development&#8230;<\/h3>\n                <div class=\"child:last-of-type:mb-0\">\n                                            &#8230;with SQL Toolbelt Essentials. 10 ingeniously simple tools for accelerating development, reducing risk, and standardizing workflows.                                    <\/div>\n            <\/div>\n                            <a href=\"https:\/\/www.red-gate.com\/products\/sql-toolbelt-essentials\/\" class=\"btn btn--secondary btn--lg\">Learn more &amp; try for free<\/a>\n                    <\/div>\n    <\/div>\n<\/section>\n\n\n<section id=\"faq\" class=\"faq-block my-5xl\">\n    <h2>FAQs: Ordered columnstore indexes in SQL Server 2022<\/h2>\n\n                        <h3 class=\"mt-4xl\">1. How do you create an ordered columnstore index in SQL Server 2022?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Use the ORDER clause: CREATE CLUSTERED COLUMNSTORE INDEX CCI ON dbo.YourTable ORDER ([DateColumn]). You can also specify ORDER during a rebuild: ALTER INDEX CCI ON dbo.YourTable REBUILD WITH (ORDER ([DateColumn])). The ORDER clause sorts data within each partition, creating tight min\/max ranges in rowgroup metadata that enable efficient rowgroup elimination during queries.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">2. What are the limitations of ordered columnstore indexes?<\/h3>\n            <div class=\"faq-answer\">\n                <p>The ORDER clause only applies during index creation and rebuilds, not during ongoing INSERT operations. New rows inserted after the index is built land in delta rowgroups and may not maintain the specified order when compressed. Large bulk inserts may also not sort correctly. This means periodic rebuilds may still be needed to maintain optimal ordering. Additionally, the feature is limited to clustered columnstore indexes &#8211; nonclustered columnstore indexes inherit order from the underlying rowstore.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">3. Do you still need to manually order data before creating a columnstore index?<\/h3>\n            <div class=\"faq-answer\">\n                <p>In SQL Server 2022+, the ORDER clause handles sorting during creation and rebuilds. However, for ongoing data loads, you may still need to ensure data arrives in order, or schedule periodic rebuilds with ORDER to maintain rowgroup elimination effectiveness. In earlier SQL Server versions, the manual approach (create B-tree clustered index on the order column, then convert to columnstore with DROP_EXISTING) remains the only option.<\/p>\n            <\/div>\n            <\/section>\n","protected":false},"excerpt":{"rendered":"<p>Learn how to use the ORDER clause with columnstore indexes in SQL Server 2022. Covers automatic data ordering, rowgroup elimination improvements, rebuild behavior, and current limitations.&hellip;<\/p>\n","protected":false},"author":329827,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[53,143531],"tags":[],"coauthors":[101655],"class_list":["post-96516","post","type-post","status-publish","format-standard","hentry","category-featured","category-t-sql-programming-sql-server"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/96516","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/329827"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=96516"}],"version-history":[{"count":7,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/96516\/revisions"}],"predecessor-version":[{"id":109084,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/96516\/revisions\/109084"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=96516"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=96516"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=96516"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=96516"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}