{"id":96175,"date":"2023-03-30T23:21:49","date_gmt":"2023-03-30T23:21:49","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=96175"},"modified":"2023-03-29T04:54:59","modified_gmt":"2023-03-29T04:54:59","slug":"generate-series-to-build-a-set","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/generate-series-to-build-a-set\/","title":{"rendered":"GENERATE_SERIES: My new go-to to build sets"},"content":{"rendered":"<section id=\"bertrand-blog-post\">I have come across a lot of use cases for manufacturing rows on the fly, aside from the common goal of populating a large data set such as a numbers or calendar table. A few favorites include building sample data, pivoting an unknown number of columns, data extrapolation, and filling gaps in date or time ranges.<\/p>\n<p>If you are on SQL Server 2022 or Azure SQL Database, or have been reading up on new features, you&#8217;ve likely heard about one of the better T-SQL enhancements: a new built-in function called <a href=\"https:\/\/learn.microsoft.com\/en-us\/sql\/t-sql\/functions\/generate-series-transact-sql\" target=\"_blank\" rel=\"noopener\"><code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">GENERATE_SERIES<\/code><\/a>. The syntax is straightforward &#8211; it accepts arguments for <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">start<\/code> and <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">stop<\/code>, and an optional argument to indicate <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">step<\/code> (in case you want to iterate by more than <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">1<\/code>, or backwards):<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:tsql decode:true whitespace-before:1 whitespace-after:1\"> SELECT value FROM GENERATE_SERIES(&lt;start&gt;, &lt;stop&gt; [, &lt;step&gt;]);<\/pre>\n<p>A few quick examples:<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:tsql decode:true whitespace-before:1 whitespace-after:1\"> \/* count to 6 *\/ \r\n SELECT [1-6] = value \r\n   FROM GENERATE_SERIES(1, 6);\r\n\r\n \/* count by 5s to 30 *\/\r\n SELECT [step 5] = value \r\n   FROM GENERATE_SERIES(5, 30, 5);\r\n\r\n \/* count from 10 to 0, backwards, by 2 *\/\r\n SELECT [backward] = value\r\n   FROM GENERATE_SERIES(10, 0, -2);\r\n\r\n \/* get all the days in a range, inclusive *\/\r\n DECLARE @start date = '20230401',\r\n         @end   date = '20230406';\r\n\r\n SELECT [days in range]  = DATEADD(DAY, value, @start)\r\n   FROM GENERATE_SERIES(0, DATEDIFF(DAY, @start, @end));<\/pre>\n<p>Results:<\/p>\n<table style=\"border-collapse: collapse; font-size: 0.875rem; line-height: 1.25rem;\">\n<tbody>\n<tr>\n<td style=\"border: 1px solid #000; background: #f4f4f4; vertical-align: middle; padding: 5px 8px!important; text-align: center;\"><strong>1-6<\/strong><br \/>\n(first resultset)<\/td>\n<td rowspan=7 style=\"border:0;padding:5px\">&nbsp;<\/td>\n<td style=\"border: 1px solid #000; background: #f4f4f4; vertical-align: middle; padding: 5px 8px!important; text-align: center;\"><strong>step 5<\/strong><br \/>\n(second resultset)<\/td>\n<td rowspan=7 style=\"border:0;padding:5px\">&nbsp;<\/td>\n<td style=\"border: 1px solid #000; background: #f4f4f4; vertical-align: middle; padding: 5px 8px!important; ;text-align: center;\"><strong>backward<\/strong><br \/>\n(third resultset)<\/td>\n<td rowspan=7 style=\"border:0;padding:5px\">&nbsp;<\/td>\n<td style=\"border: 1px solid #000; background: #f4f4f4; vertical-align: middle; padding: 5px 8px!important; ;text-align: center;\"><strong>days in range<\/strong><br \/>\n(fourth resultset)<\/td>\n<\/tr>\n<tr>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">1<\/td>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">5<\/td>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">10<\/td>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">2023-04-01<\/td>\n<\/tr>\n<tr>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">2<\/td>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">10<\/td>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">8<\/td>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">2023-04-02<\/td>\n<\/tr>\n<tr>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">3<\/td>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">15<\/td>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">6<\/td>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">2023-04-03<\/td>\n<\/tr>\n<tr>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">4<\/td>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">20<\/td>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">4<\/td>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">2023-04-04<\/td>\n<\/tr>\n<tr>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">5<\/td>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">25<\/td>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">2<\/td>\n<td style=\"border: 1px solid #000; text-align: center; padding: 6px 8px 5px 8px!important;\">2023-04-05<\/td>\n<\/tr>\n<tr>\n<td style=\"border: 1px solid #000; border-bottom-width: 2px; text-align: center; padding: 6px 8px 5px 8px!important;\">6<\/td>\n<td style=\"border: 1px solid #000; border-bottom-width: 2px; text-align: center; padding: 6px 8px 5px 8px!important;\">30<\/td>\n<td style=\"border: 1px solid #000; border-bottom-width: 2px; text-align: center; padding: 6px 8px 5px 8px!important;\">0<\/td>\n<td style=\"border: 1px solid #000; border-bottom-width: 2px; text-align: center; padding: 6px 8px 5px 8px!important;\">2023-04-06<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div style=\"background: #fff; width: 100%; margin-top: -35px;\">\u00a0<\/div>\n<p>That is some handy syntax that is quite easy to use. <a href=\"https:\/\/www.mssqltips.com\/sqlservertip\/7265\/sql-server-2022-t-sql-enhancements\/?utm_source=AaronBertrand\" target=\"_blank\" rel=\"noopener\">I dug in more about it during the beta<\/a>, but&#8230;<\/p>\n<h3>How would we do this on older versions of SQL Server?<\/h3>\n<p>We&#8217;ve been generating sets since before SQL Server was SQL Server, so we&#8217;ve always found a way. Some approaches are cryptic, and some perform poorly; others are cryptic <em>and<\/em> perform poorly. I have two that I like: one that works in SQL Server 2016 and above, and one that works all the way back to SQL Server 2008. There are others (even some that will work on SQL Server 2000), but these are the two I want to focus on today.<\/p>\n<p>I&#8217;m going to present both techniques as inline table-valued functions, since the logic is complicated enough to justify encapsulation, and that also happens to keep demos nice and tidy. These will be written to accommodate a series of up to 4,000 values &#8211; we can certainly go beyond that, but exceeding 8,001 values leads to the first solution requiring LOB support, which can do unpredictable things to performance. The second is capped at 4,096 values because it is the highest power of 4 that is also less than 8,001; you&#8217;ll see why that&#8217;s important in a moment.<\/p>\n<p><span style=\"text-align: center; color: white; background: #c00; border-radius: 5px; font-weight: bold; padding: 3px 9px 2px 8px; display: inline-block!important; margin-right: 0.5rem;\">2016+<\/span> <b>STRING_SPLIT + REPLICATE<\/b><\/p>\n<p>This one is a rather recent addition to my toolbox; I don&#8217;t recall where I first came across it, but I like it because it&#8217;s concise without being overly opaque. We determine the number of values we want in our sequence, less one &#8211; which is the <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">stop<\/code> minus the <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">start<\/code>. We use <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">REPLICATE<\/code> to generate a string that is a sequence of <em>that many<\/em> commas. Then we split that string using <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">STRING_SPLIT<\/code>, which results in <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">{ stop - start + 1 }<\/code> empty strings. We then apply a <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">ROW_NUMBER()<\/code> to the output, which serves as our series. Since our starting value might not be <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">1<\/code>, we add it to the row number, and subtract <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">1<\/code>.<\/p>\n<p>To get started, I will create a new database named <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">GenSeries<\/code> to put the sample code.<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:tsql decode:true whitespace-before:1 whitespace-after:1\"> CREATE FUNCTION dbo.GenerateSeries_Split\r\n (\r\n   @start int,\r\n   @stop  int\r\n )\r\n RETURNS TABLE WITH SCHEMABINDING\r\n AS\r\n   RETURN\r\n   (\r\n     SELECT TOP (@stop - @start + 1) \r\n       value = ROW_NUMBER() OVER (ORDER BY @@SPID) + @start - 1\r\n     FROM STRING_SPLIT(REPLICATE(',', @stop - @start), ',')\r\n     ORDER BY value\r\n   );<\/pre>\n<p>To support a range greater than 8,001 values, you can change this line:<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:tsql decode:true whitespace-before:1 whitespace-after:1\"> FROM STRING_SPLIT(REPLICATE(CONVERT(varchar(max),','), @stop - @start), ',')<\/pre>\n<p>&#8230;but that&#8217;s not the version I&#8217;m going to test today.<\/p>\n<p><!-- People often use <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">ORDER BY (SELECT NULL)<\/code>; I use either <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">@@SPID<\/code> or <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">@@TRANCOUNT<\/code> depending on whether I want to encourage parallelism or avoid <a target=\"_blank\" href=\"https:\/\/dba.stackexchange.com\/q\/169451\/1186\" rel=\"noopener\">this kind of problem<\/a>. --><\/p>\n<p><span style=\"text-align: center; color: white; background: #c00; border-radius: 5px; font-weight: bold; padding: 3px 9px 2px 8px; display: inline-block!important; margin-right: 0.5rem;\">2008+<\/span> <b>Cross-Joined CTEs<\/b><\/p>\n<p>This solution reaches further back into most of the unsupported versions of SQL Server you might still be clinging to but, unfortunately, it is a little more cryptic. I remember first using it in <a href=\"https:\/\/stackoverflow.com\/a\/69541144\/61305\" target=\"_blank\" rel=\"noopener\">this solution<\/a> after discovering <a href=\"https:\/\/www.sqlservercentral.com\/scripts\/string_split-function-for-sql-2012-and-above\" target=\"_blank\" rel=\"noopener\">this really efficient implementation<\/a> by Jonathan Roberts.<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:tsql decode:true whitespace-before:1 whitespace-after:1\"> CREATE FUNCTION dbo.GenerateSeries_CTEs\r\n (\r\n   @start int,\r\n   @stop  int\r\n )\r\n RETURNS TABLE WITH SCHEMABINDING \r\n AS \r\n   RETURN\r\n   (\r\n     \/* could work in 2005 by changing VALUES to a UNION ALL *\/\r\n     WITH n(n) AS (SELECT 0 FROM (VALUES (0),(0),(0),(0)) n(n)),\r\n      i4096(n) AS (SELECT 0 FROM n a, n b, n c, n d, n e, n f)      \r\n     SELECT TOP (@stop - @start + 1) \r\n       value = ROW_NUMBER() OVER (ORDER BY @@TRANCOUNT) + @start - 1 \r\n     FROM i4096\r\n     ORDER BY value\r\n   );<\/pre>\n<p>This approach uses two CTEs &#8211; one that just generates 4 rows using a <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">VALUES<\/code> constructor; the second one cross joins it to itself, however many times is necessary to cover the range of values you need to support. (In our case, we want to support 4,000 values.)<\/p>\n<p>Each time you cross join the original set of 4, you produce a Cartesian product of <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">4^n<\/code>, where <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">n<\/code> is bumped by 1 for each new reference. So if you just named it once, you&#8217;d have <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">4^1<\/code>, which is 4. The second reference is <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">4^2<\/code>, which is 16. Then <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">4^3<\/code> = 64, <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">4^4<\/code> = 256, <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">4^5<\/code> = 1,024, and <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">4^6<\/code> = 4,096. I&#8217;ll try to illustrate in an image:<\/p>\n<p><a href=\"\/simple-talk\/wp-content\/uploads\/2023\/03\/ab-gen-series-cross-join-explained.png\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" style=\"width: 80%; min-width: 320px; max-width: 800px;\" src=\"\/simple-talk\/wp-content\/uploads\/2023\/03\/ab-gen-series-cross-join-explained.png\" alt=\"Explaining cross join powers of 4\" \/><\/a><\/p>\n<p>If you only need to support 256 values, for example, then you could change that second line to stop at the 4th cross join:<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:tsql decode:true whitespace-before:1 whitespace-after:1\">   i256(n) AS (SELECT 0 FROM n a, n b, n c, n d)<\/pre>\n<p>And if you needed more than 4,096 values &#8211; say, up to 16,384 &#8211; you would instead just add one additional cross join:<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:tsql decode:true whitespace-before:1 whitespace-after:1\">   i16K(n) AS (SELECT 0 FROM n a, n b, n c, n d, n e, n f, n g)<\/pre>\n<p>And of course you can be more verbose and self-documenting. Technically, I would want to write the following, it&#8217;s just a lot more to digest on first glance:<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:tsql decode:true whitespace-before:1 whitespace-after:1\">   i4096(n) AS \r\n        (\r\n           SELECT 0 FROM n AS n4    CROSS JOIN n AS n16 \r\n              CROSS JOIN n AS n64   CROSS JOIN n AS n256\r\n              CROSS JOIN n AS n1024 CROSS JOIN n AS n4096\r\n           \/* ... *\/\r\n        )<\/pre>\n<p>You could also code defensively and alter the parameters to <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">smallint<\/code> or <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">tinyint<\/code>\u00a0to prevent surprises when someone uses an <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">int<\/code> value that is too large and they don&#8217;t get the full set they expect. This won&#8217;t raise an error, unless you also add additional handling, say, to divide by 0 somewhere if the range is too large. Keep in mind that someone could try to generate 100 rows by passing in a start parameter of 2,000,000,000 and a stop parameter of 2,000,000,100 &#8211; so restricting either input value instead of the difference might be unnecessarily limiting.<\/p>\n<p>I often see recursive CTEs suggested for set generation, since they are a little less cryptic than this, and are somewhat self-documenting (if you already understand recursive CTEs, I suppose). I do like recursive CTEs generally, and have offered them up in many posts and answers, but they&#8217;re not ideal for broad consumption in this context unless you will never retrieve more than 100 rows (say, generating the days for a monthly report). This is because you will need a <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">MAXRECURSION<\/code> query hint to produce more than 100 values; since you can&#8217;t put that hint inside a function, it means you have to put it on every outer query that references the function. Ick! So much for encapsulation.<\/p>\n<h3>So how do they perform?<\/h3>\n<p>I thought about the simplest test I can do to pit different number generation techniques against each other, and the first that came to mind involves <a href=\"https:\/\/www.red-gate.com\/simple-talk\/homepage\/paging-data-in-t-sql\/\" target=\"_blank\" rel=\"noopener\">pagination<\/a>. <em>(Note: This is a contrived use case and not intended to be a discussion about the best ways to paginate data.)<\/em><\/p>\n<p>In the <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">GenSeries<\/code> database, I will create a simple table with 4,000 rows:<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:tsql decode:true whitespace-before:1 whitespace-after:1\"> SELECT TOP (4000) rn = IDENTITY(int,1,1),*\r\n INTO dbo.things FROM sys.all_columns;\r\n\r\n CREATE UNIQUE CLUSTERED INDEX cix_things ON dbo.things(rn);<\/pre>\n<p>Then I created three stored procedures. One that uses the split approach:<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:tsql decode:true whitespace-before:1 whitespace-after:1\"> CREATE OR ALTER PROCEDURE dbo.PaginateCols_Split\r\n   @PageSize int = 100,\r\n   @PageNum  int = 1\r\n AS\r\n BEGIN\r\n   SET NOCOUNT ON;\r\n\r\n   DECLARE @s int = (@PageNum-1) * @PageSize + 1;\r\n   DECLARE @e int = @s + @PageSize - 1;\r\n\r\n   WITH r(rn) AS\r\n   (\r\n     SELECT TOP (@PageSize) rn = value\r\n     FROM dbo.GenerateSeries_Split(@s, @e)\r\n   )\r\n   SELECT t.* FROM dbo.things AS t \r\n   INNER JOIN r ON t.rn = r.rn;\r\n END<\/pre>\n<p>One that uses stacked CTEs:<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:tsql decode:true whitespace-before:1 whitespace-after:1\"> CREATE OR ALTER PROCEDURE dbo.PaginateCols_CTEs\r\n   @PageSize int = 100,\r\n   @PageNum  int = 1\r\n AS\r\n BEGIN\r\n   SET NOCOUNT ON;\r\n\r\n   DECLARE @s int = (@PageNum-1) * @PageSize + 1;\r\n   DECLARE @e int = @s + @PageSize - 1;\r\n\r\n   WITH r(rn) AS\r\n   (\r\n     SELECT TOP (@PageSize) rn = value\r\n     FROM dbo.GenerateSeries_CTEs(@s, @e)\r\n   )\r\n   SELECT t.* FROM dbo.things AS t \r\n   INNER JOIN r ON t.rn = r.rn;\r\n END<\/pre>\n<p>And one that uses <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">GENERATE_SERIES<\/code> directly:<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:tsql decode:true whitespace-before:1 whitespace-after:1\"> CREATE OR ALTER PROCEDURE dbo.PaginateCols_GenSeries\r\n   @PageSize int = 100,\r\n   @PageNum  int = 1\r\n AS\r\n BEGIN\r\n   SET NOCOUNT ON;\r\n\r\n   DECLARE @s int = (@PageNum-1) * @PageSize + 1;\r\n   DECLARE @e int = @s + @PageSize - 1;\r\n\r\n   WITH r(rn) AS\r\n   (\r\n     SELECT TOP (@PageSize) rn = value\r\n     FROM GENERATE_SERIES(@s, @e)\r\n   )\r\n   SELECT t.* FROM dbo.things AS t \r\n   INNER JOIN r ON t.rn = r.rn;\r\n END<\/pre>\n<p>Then I created a wrapper that will call each of them with a defined page number &#8211; this way I could test the beginning, middle, and end of the set (pagination often sees tanking performance as the page number gets higher). This table is hardly a performance nightmare but if I ran the procedures enough times I would hopefully see some variance.<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:tsql decode:true whitespace-before:1 whitespace-after:1\"> CREATE OR ALTER PROCEDURE dbo.PaginateCols_Wrapper\r\n   @PageNum int = 1\r\n AS\r\n BEGIN\r\n   SET NOCOUNT ON;\r\n\r\n   EXEC dbo.PaginateCols_Split     @PageNum = @PageNum;\r\n   EXEC dbo.PaginateCols_CTEs      @PageNum = @PageNum;\r\n   EXEC dbo.PaginateCols_GenSeries @PageNum = @PageNum;\r\n END<\/pre>\n<p>If you execute this procedure, you will see 3 output sets that contain rows from <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">sys.columns<\/code>. If you vary the <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">@pagenum<\/code> parameter value, you will see different pages of data from that set, but each three will be the same results. The only difference is the series generating code.<\/p>\n<p>I turned on <a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/database-administration-sql-server\/the-sql-server-2016-query-store-overview-and-architecture\/\" target=\"_blank\" rel=\"noopener\">Query Store<\/a>, and always want to remind you that <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">QUERY_CAPTURE_MODE = ALL<\/code> is not a production-friendly option &#8211; but quite handy if you want to make sure you capture every instance of every query:<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:tsql decode:true whitespace-before:1 whitespace-after:1\"> ALTER DATABASE GenSeries SET QUERY_STORE \r\n (\r\n   OPERATION_MODE              = READ_WRITE,\r\n   QUERY_CAPTURE_MODE          = ALL \/* Do not do this in production! *\/\r\n );<\/pre>\n<p>I didn&#8217;t want to run the procedures a bunch of times manually; I like using <a href=\"https:\/\/www.nuget.org\/packages\/sqlstresscmd\" target=\"_blank\" rel=\"noopener\">sqlstresscmd<\/a> because I can run tests hundreds of thousands of times without guilt about overwhelming a poor UI, or waiting for results to render, or battling resource conflicts and poisoning the test as a result. It runs the queries, discards the results, and that&#8217;s it.<\/p>\n<p>I configured a JSON file called <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">GenSeries.json<\/code> like this, to run each procedure 10,000 times across 16 threads. It took about 5 minutes to run on average:<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:go decode:true whitespace-before:1 whitespace-after:1\"> {\r\n   \"CollectIoStats\": true,\r\n   \"CollectTimeStats\": true,\r\n   \"MainDbConnectionInfo\": \r\n   {\r\n     \"Database\": \"GenSeries\",\r\n     \"Login\": \"sa\",\r\n     \"Password\": \"$tr0ng_P@$$w0rd\",\r\n     \"Server\": \"127.0.0.1,2022\"\r\n   },\r\n   \"MainQuery\": \"EXEC dbo.PaginateCols_Wrapper @PageNum = 1;\",\r\n   \"NumIterations\": 10000,\r\n   \"NumThreads\": 16,\r\n   \"ShareDbSettings\": true\r\n }<\/pre>\n<p>Then ran it using the following:<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:tsql decode:true whitespace-before:1 whitespace-after:1\"> sqlstresscmd -s ~\/Documents\/GenSeries.json<\/pre>\n<p>Then I collected the average runtimes from Query Store:<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:tsql decode:true whitespace-before:1 whitespace-after:1\"> SELECT qt.query_sql_text,\r\n        avg_duration       = AVG(rs.avg_duration\/1000.0)\r\n   FROM sys.query_store_query_text AS qt\r\n   INNER JOIN sys.query_store_query AS q \r\n     ON qt.query_text_id = q.query_text_id\r\n   INNER JOIN sys.query_store_plan  AS p \r\n     ON q.query_id = p.query_id\r\n   INNER JOIN sys.query_store_runtime_stats AS rs \r\n     ON p.plan_id = rs.plan_id\r\n   WHERE qt.query_sql_text LIKE N'%dbo.things%'\r\n     AND qt.query_sql_text NOT LIKE N'%sys.query_store%'\r\n   GROUP BY qt.query_sql_text;<\/pre>\n<p>When I wanted to switch to the middle or the end of the set, I ran this query to clear Query Store data. (Note: you will need to capture the results from Query Store each time before executing this statement as this clears everything from Query Store):<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:tsql decode:true whitespace-before:1 whitespace-after:1\"> ALTER DATABASE GenSeries SET QUERY_STORE CLEAR;<\/pre>\n<p>Then I changed the <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">MainQuery<\/code> line appropriately to run tests for the middle and the end. For rows 1,901 &#8211; 2,000:<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:go decode:true whitespace-before:1 whitespace-after:1\"> \"MainQuery\": \"EXEC dbo.PaginateCols_Wrapper @PageNum = 20;\",<\/pre>\n<p>And for rows 3,901 &#8211; 4,000:<\/p>\n<pre class=\"theme:tomorrow-night font:consolas font-size:14 line-height:16 decode-attributes:false tab-convert:true lang:go decode:true whitespace-before:1 whitespace-after:1\"> \"MainQuery\": \"EXEC dbo.PaginateCols_Wrapper @PageNum = 40;\",<\/pre>\n<p>Here are the timing results in milliseconds (<a href=\"\/simple-talk\/wp-content\/uploads\/2023\/03\/ab-gen-series-duration.png\" target=\"_blank\" rel=\"noopener\">click to enlarge<\/a>):<\/p>\n<p><a href=\"\/simple-talk\/wp-content\/uploads\/2023\/03\/ab-gen-series-duration.png\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" style=\"width: 100%; min-width: 320px;\" src=\"\/simple-talk\/wp-content\/uploads\/2023\/03\/ab-gen-series-duration.png\" alt=\"Line graph showing average duration, in milliseconds, of three different series generation techniques\" \/><\/a><\/p>\n<p>In these tests, the split approach was the winner, but the new built-in function is right on its heels. The stacked CTEs, while much more backward-compatible, have become a bit of an outlier.<\/p>\n<p><em>I would love to see some flat lines in there, of course, since there shouldn&#8217;t be any penalty for jumping ahead to any page; but, not the point today. I do plan to revisit some of my old pagination techniques in a future article.<\/em><\/p>\n<h3>Conclusion<\/h3>\n<p>As the title suggests, I&#8217;m pretty happy with the syntax of <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">GENERATE_SERIES<\/code> so far, and I hope you get to try it out sooner than later! The performance of the split approach is slightly better, but both are still relatively linear and, for the simplicity of the implementation, I&#8217;d be inclined to use the newer syntax in most cases. At this scale, we&#8217;re talking about single-digit milliseconds anyway, so maybe not all that telling other than &#8220;this is worth testing.&#8221;<\/p>\n<p>And to reiterate, this wasn&#8217;t meant to show that any of these methods might be better for pagination specifically &#8211; it was a completely manufactured scenario where the table just happened to have contiguous row numbers to join to the output. This was more a demonstration of how easy it is to swap <code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">GENERATE_SERIES<\/code> into places where you&#8217;re using more convoluted methods today.<\/p>\n<h4>Further reading<\/h4>\n<p>As far as series generation goes, there are other options out there, too, including some from Paul White, Itzik Ben-Gan, and others in this 6-part <a href=\"https:\/\/sqlperformance.com\/2021\/05\/t-sql-queries\/number-series-solutions-5\">Number series generator challenge<\/a> from 2021. In particular, there is an interesting solution from Paul White (<code style=\"font-family: consolas; background: #e4e4e4; padding: 3px 4px 1px 4px; border-radius: 3px;\">dbo.GetNums_SQLkiwi<\/code>) in <a href=\"https:\/\/sqlperformance.com\/2021\/04\/t-sql-queries\/number-series-solutions-4\" target=\"_blank\" rel=\"noopener\">solutions part 4<\/a>, but it does require a little concentration, and is version-limiting (it requires a table with a clustered columnstore index). You should do more thorough testing with his and other approaches from that series, with your data and workload, especially if your primary objective is squeezing performance. Some solutions will only be options if you are on modern versions and\/or have some leeway in implementation (some CLR solutions might be interesting as well).<\/p>\n<\/section>\n","protected":false},"excerpt":{"rendered":"<p>I have come across a lot of use cases for manufacturing rows on the fly, aside from the common goal of populating a large data set such as a numbers or calendar table. A few favorites include building sample data, pivoting an unknown number of columns, data extrapolation, and filling gaps in date or time&#8230;&hellip;<\/p>\n","protected":false},"author":341115,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[53,143531],"tags":[147170,4745],"coauthors":[158980],"class_list":["post-96175","post","type-post","status-publish","format-standard","hentry","category-featured","category-t-sql-programming-sql-server","tag-sql-server-2022","tag-transact-sql"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/96175","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/341115"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=96175"}],"version-history":[{"count":123,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/96175\/revisions"}],"predecessor-version":[{"id":96513,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/96175\/revisions\/96513"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=96175"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=96175"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=96175"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=96175"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}