{"id":92375,"date":"2021-09-07T17:18:41","date_gmt":"2021-09-07T17:18:41","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=92375"},"modified":"2022-01-19T17:04:07","modified_gmt":"2022-01-19T17:04:07","slug":"how-lag-compares-to-other-techniques","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/how-lag-compares-to-other-techniques\/","title":{"rendered":"How LAG compares to other techniques"},"content":{"rendered":"<p>Microsoft introduced the first window (aka, windowing or windowed) functions with SQL Server 2005. These functions were <code>ROW_NUMBER<\/code>, <code>RANK<\/code>, <code>DENSE_RANK<\/code>, <code>NTILE,<\/code> and the window aggregates. Many folks, including myself, used these functions without realizing they were part of a special group. In 2012, Microsoft added several more: <code>LAG<\/code> and <code>LEAD<\/code>, <code>FIRST_VALUE<\/code> and <code>LAST_VALUE<\/code>, <code>PERCENT_RANK<\/code> and <code>CUME_DIST<\/code>, <code>PERCENTILE_CONT,<\/code> and <code>PERCENTILE_DISC<\/code>. They also added the ability to do running totals and moving calculations.<\/p>\n<p>These functions were promoted as improving performance over older techniques, but that isn\u2019t always the case. There were still performance problems with the aggregate functions introduced in 2005 and the four of the functions introduced in 2012. In 2019, Microsoft introduced <a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/t-sql-programming\/reduce-cpu-of-large-analytic-queries-without-changing-code\/\">Batch Mode on Row Store<\/a>, available on Enterprise and Developer Editions, that can improve the performance of window aggregates and the four statistical functions from 2012.<\/p>\n<p>I started writing this article to compare some window function solutions to traditional solutions. I found that there were so many ways to write a query that includes a column from another row that this article is dedicated to the window functions <code>LAG<\/code> and <code>LEAD<\/code>.<\/p>\n<h2>Include a column from another row using LAG<\/h2>\n<p>Including a column from another row typically means some a self-join. The code is somewhat difficult to write and doesn\u2019t perform well. The <code>LAG<\/code> function can be used to pull in the previous row without a self-join. As long as there are adequate indexes in place, using <code>LAG<\/code> solves those problems. <code>LEAD<\/code> works the same way as <code>LAG<\/code>, except that it grabs a later row.<\/p>\n<p>Here\u2019s an example using <code>LAG<\/code> in the AdventureWorks database:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SET STATISTICS IO, TIME ON;\r\n--LAG\r\nSELECT PROD.ProductID, PROD.Name, SOH.OrderDate,\r\n    DATEDIFF(DAY, LAG(SOH.OrderDate) \r\n    OVER(PARTITION BY PROD.ProductID \r\n    ORDER BY SOH.OrderDate),SOH.OrderDate) AS DaysBetweenOrders\r\nFROM Production.Product AS PROD \r\nJOIN Sales.SalesOrderDetail AS SOD \r\nON SOD.ProductID = PROD.ProductID\r\nJOIN Sales.SalesOrderHeader AS SOH \r\nON SOH.SalesOrderID = SOD.SalesOrderID\r\nGROUP BY PROD.ProductID, PROD.Name, SOH.OrderDate\r\nORDER BY PROD.ProductID, SOH.OrderDate;<\/pre>\n<p>This query returns a list of products and the dates they were ordered. It uses the <code>DATEDIFF<\/code> function to compare the current order date to the prior order date determined with <code>LAG<\/code>. The <code>OVER<\/code> clause uses a <code>PARTITION<\/code> <code>BY<\/code> on <code>ProductID<\/code> to ensure that different products are not compared. The <code>OVER<\/code> clause for <code>LAG<\/code> requires an <code>ORDER<\/code> <code>BY<\/code>. It is ordered by <code>OrderDate<\/code> since that is how the rows should be lined up to find the previous date. Figure 1 shows the partial results.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-92376\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/09\/table-description-automatically-generated-with-me.png\" alt=\"Image showing the partial results\" width=\"919\" height=\"348\" \/><\/p>\n<p><strong>Figure 1. The partial results of the query using LAG<\/strong><\/p>\n<p>The query ran in under a second, about 300 ms, and required just 365 logical reads, as shown in Figure 2.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-92377\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/09\/text-description-automatically-generated.png\" alt=\"Image showing logical reads from LAG\" width=\"844\" height=\"157\" \/><\/p>\n<p><strong>Figure 2. The logical reads of the LAG query<\/strong><\/p>\n<p>I tried several methods to see if it was possible to write a query that performed well without LAG. Even though the database is small, the queries took some time to run, depending on the technique.<\/p>\n<h2>Self-join<\/h2>\n<p>The self-join technique is painfully slow. Keeping the first order row of each <code>ProductID<\/code> in the results required <code>LEFT<\/code> <code>JOINS<\/code>, but this was so slow, I killed the query after a few minutes. Instead, the following query omits the <code>NULL<\/code> row for each product.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk \">--Self-join\r\nSELECT PROD.ProductID, PROD.Name, SOH.OrderDate,\r\n   DATEDIFF(DAY, MAX(SOH2.OrderDate), SOH.OrderDate) AS DaysBetweenOrders\r\nFROM Production.Product AS PROD \r\nJOIN Sales.SalesOrderDetail AS SOD \r\nON SOD.ProductID = PROD.ProductID\r\nJOIN Sales.SalesOrderHeader AS SOH \r\nON SOH.SalesOrderID = SOD.SalesOrderID\r\nJOIN Sales.SalesOrderDetail AS SOD2 \r\nON SOD2.ProductID = PROD.ProductID\r\nJOIN Sales.SalesOrderHeader AS SOH2 \r\nON SOH2.SalesOrderID = SOD2.SalesOrderID \r\nWHERE SOH2.OrderDate &lt; SOH.OrderDate\r\nGROUP BY PROD.ProductID\r\n   , PROD.Name\r\n   , SOH.OrderDate\r\n   , SOD2.ProductID\r\nORDER BY PROD.ProductID, SOH.OrderDate;<\/pre>\n<p>The query ran in 20 seconds and had 3,103 logical reads, as shown in Figure 3.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-92378\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/09\/text-description-automatically-generated-1.png\" alt=\"Statistics IO from self-join\" width=\"879\" height=\"161\" \/><\/p>\n<p><strong>Figure 3. Logical reads of self-join query<\/strong><\/p>\n<p>The query uses the <code>MAX<\/code> function To find the previous <code>OrderDate<\/code> and filters to find rows in <code>SOH2<\/code> with <code>OrderDate<\/code> less than the <code>OrderDate<\/code> in <code>SOD<\/code>.<\/p>\n<h2>Derived table<\/h2>\n<p>Is it possible to improve the performance of the self-join with a derived table? Here&#8217;s a query to find out:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">--Derived tables\r\nSELECT S1.ProductID, S1.Name, S1.OrderDate, \r\n\tDATEDIFF(DAY,MAX(S2.OrderDate),S1.OrderDate) AS DaysBetweenOrders\r\nFROM \r\n\t(SELECT Prod.ProductID, Prod.Name, SOH.OrderDate \r\n\tFROM Production.Product AS PROD \r\n\tJOIN Sales.SalesOrderDetail AS SOD \r\n\tON SOD.ProductID = PROD.ProductID\r\n\tJOIN Sales.SalesOrderHeader AS SOH \r\n\tON SOH.SalesOrderID = SOD.SalesOrderID\r\n\tGROUP BY PROD.ProductID\r\n           , PROD.Name\r\n           , SOH.OrderDate\r\n\t\t) AS S1\r\n\tLEFT JOIN (\r\n\tSELECT SOD.ProductID, SOH.OrderDate \r\n\tFROM Sales.SalesOrderDetail AS SOD \r\n\tJOIN Sales.SalesOrderHeader AS SOH \r\n\tON SOH.SalesOrderID = SOD.SalesOrderID\r\n\tGROUP BY SOD.ProductID\r\n           , SOH.OrderDate\r\n\t\t) AS S2\r\nON S2.ProductID = S1.ProductID AND S1.OrderDate &gt; S2.OrderDate\r\nGROUP BY  S1.ProductID\r\n        , S1.Name\r\n        , S1.OrderDate\r\nORDER BY S1.ProductID, S1.OrderDate;<\/pre>\n<p>I was surprised to find that this query ran in 2 seconds even though the logical reads were much higher!<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-92379\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/09\/text-description-automatically-generated-2.png\" alt=\"Statistics IO from derived table\" width=\"859\" height=\"148\" \/><\/p>\n<p><strong>Figure 4. Logical reads of derived table query<\/strong><\/p>\n<p>That\u2019s quite good, though not as good as the query with LAG.<\/p>\n<h2>Common table expression<\/h2>\n<p>Another way to solve the query is with a common table expression (CTE):<\/p>\n<pre class=\"lang:tsql decode:true\">--CTE\r\nWITH Products AS (\r\n  SELECT PROD.ProductID, PROD.Name, SOH.OrderDate\r\n  FROM Production.Product AS PROD\r\n  JOIN Sales.SalesOrderDetail AS SOD\r\n  ON SOD.ProductID = PROD.ProductID\r\n  JOIN Sales.SalesOrderHeader AS SOH\r\n  ON SOH.SalesOrderID = SOD.SalesOrderID\r\n  GROUP BY PROD.ProductID, PROD.Name, SOH.OrderDate\r\n)\r\nSELECT P1.ProductID, P1.Name, P1.OrderDate,\r\nDATEDIFF(DAY, MAX(P2.OrderDate), P1.OrderDate) AS DaysBetweenOrders\r\nFROM Products P1\r\nLEFT JOIN Products P2\r\nON P2.ProductID = P1.ProductID\r\nWHERE P1.OrderDate &gt; P2.OrderDate\r\nGROUP BY P1.ProductID, P1.Name, P1.OrderDate;<\/pre>\n<p>&nbsp;<\/p>\n<p>This query also ran in 2 seconds with the same logical reads as the derived table. What\u2019s going on here? The CTE is not saving the results to be reused \u2013 the tables are accessed twice \u2013 but the joins and some other operators are different. The optimizer was able to come up with better plans by using derived tables or a CTE.<\/p>\n<h2>OUTER APPLY<\/h2>\n<p>The <code>APPLY<\/code> operator is often used to improve the performance of queries. There are two flavors: <code>CROSS<\/code> <code>APPLY<\/code> and <code>OUTER<\/code> <code>APPLY<\/code>, similar to <code>JOIN<\/code> and <code>LEFT<\/code> <code>JOIN<\/code>. The <code>APPLY<\/code> operator can be used to solve many interesting queries, and in this case, it <code>OUTER<\/code> <code>APPLY<\/code> replaces the <code>LEFT<\/code> <code>JOIN<\/code>.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">--OUTER APPLY\r\nSELECT PROD.ProductID, PROD.Name, SOH.OrderDate,\r\n   DATEDIFF(DAY, S2.PrevOrderDate, SOH.OrderDate) AS DaysBetweenOrders\r\nFROM Production.Product AS PROD \r\nJOIN Sales.SalesOrderDetail AS SOD \r\nON SOD.ProductID = PROD.ProductID\r\nJOIN Sales.SalesOrderHeader AS SOH \r\nON SOH.SalesOrderID = SOD.SalesOrderID\r\nOUTER APPLY (\r\n\tSELECT MAX(SOH2.OrderDate) AS PrevOrderDate \r\n\tFROM Sales.SalesOrderDetail AS SOD2 \r\n\tJOIN Sales.SalesOrderHeader AS SOH2 \r\n\tON SOH2.SalesOrderID = SOD2.SalesOrderID \r\n\tWHERE SOD2.ProductID = PROD.ProductID \r\n\t\tAND SOH2.OrderDate &lt; SOH.OrderDate) S2\r\nGROUP BY  DATEDIFF(DAY, S2.PrevOrderDate, SOH.OrderDate)\r\n        , PROD.ProductID\r\n        , PROD.Name\r\n        , SOH.OrderDate\r\nORDER BY PROD.ProductID, SOH.OrderDate;<\/pre>\n<p>This query ran in 12 seconds and had a whopping 86,281,577 logical reads!<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-92380\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/09\/text-description-automatically-generated-3.png\" alt=\"Statistic IO from OUTER APPLY\" width=\"939\" height=\"181\" \/><\/p>\n<p><strong>Figure 5. Logical reads from OUTER APPLY query<\/strong><\/p>\n<p><code>OUTER<\/code> <code>APPLY<\/code> is acting like a function call in this scenario calling <code>OUTER<\/code> <code>APPLY<\/code> once for each row in the outer query, which would probably not be the case if the less than operator was not involved.\u00a0<\/p>\n<p>It\u2019s possible to use <code>TOP(1)<\/code> instead of <code>MAX<\/code>, but then the query inside <code>OUTER APPLY<\/code> must also be ordered, and the results are not any better.<\/p>\n<h2>Temp Table<\/h2>\n<p>Since whatever method is used needs a distinct list of <code>ProductID<\/code>, <code>Name<\/code>, and <code>OrderDate<\/code>, these rows could be stored in a temp table.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">--Create temp table\r\nDROP TABLE IF EXISTS #ProductList;\r\nSELECT PROD.ProductID, PROD.Name, SOH.OrderDate\r\nINTO #ProductList\r\nFROM Production.Product AS PROD \r\nJOIN Sales.SalesOrderDetail AS SOD\r\nON SOD.ProductID = PROD.ProductID\r\nJOIN Sales.SalesOrderHeader AS SOH\r\nON SOH.SalesOrderID = SOD.SalesOrderID\r\nGROUP BY  PROD.ProductID\r\n        , PROD.Name\r\n        , SOH.OrderDate;<\/pre>\n<p>Here\u2019s an example where the temp table was used with <code>OUTER<\/code> <code>APPLY<\/code>:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">--OUTER APPLY with Temp table\r\nSELECT PL.ProductID, PL.Name, PL.OrderDate,\r\n   DATEDIFF(DAY, PrevOrder.OrderDate, PL.OrderDate) AS DaysBetweenOrders\r\nFROM #ProductList PL\r\nOUTER APPLY (\r\n\tSELECT MAX(PL1.OrderDate) AS OrderDate\r\n\tFROM #ProductList AS PL1\r\n\tWHERE PL1.ProductID = PL.ProductID \r\n            AND PL1.OrderDate &lt; PL.OrderDate \r\n\t) AS PrevOrder\r\nGROUP BY  DATEDIFF(DAY, PrevOrder.OrderDate, PL.OrderDate)\r\n        , PL.ProductID\r\n        , PL.Name\r\n        , PL.OrderDate\r\nORDER BY PL.ProductID, PL.OrderDate;<\/pre>\n<p>The temp table drastically improved the time, 1.4 seconds. The actual tables were only touched once to create the temp table. Then the temp table was scanned twice and joined with a Nested Loop.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-92381\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/09\/a-picture-containing-diagram-description-automati.png\" alt=\"Image showing execution plan for temp table query\" width=\"1266\" height=\"404\" \/><\/p>\n<p><strong>Figure 6. Partial execution plan when using a temp table<\/strong><\/p>\n<p>Adding an index to the temp table might improve the performance even more. This script ran in about 700 ms.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">--Indexed temp table and OUTER APPLY\r\nCREATE CLUSTERED INDEX IDX_ProductList ON #ProductList \r\n\t(ProductID, OrderDate);\r\nSELECT PL.ProductID, PL.Name, PL.OrderDate,\r\n   DATEDIFF(DAY, PrevOrder.OrderDate, PL.OrderDate) AS DaysBetweenOrders\r\nFROM #ProductList PL\r\nOUTER APPLY (\r\n\tSELECT MAX(PL1.OrderDate) AS OrderDate\r\n\tFROM #ProductList AS PL1\r\n\tWHERE PL1.ProductID = PL.ProductID \r\n        AND PL1.OrderDate &lt; PL.OrderDate \r\n\t) AS PrevOrder\r\nGROUP BY  DATEDIFF(DAY, PrevOrder.OrderDate, PL.OrderDate)\r\n        , PL.ProductID\r\n        , PL.Name\r\n        , PL.OrderDate\r\nORDER BY PL.ProductID, PL.OrderDate;<\/pre>\n<p>The self-join was also improved with the temp table, returning in just 3 seconds even while using <code>LEFT<\/code> <code>JOIN,<\/code> which wasn\u2019t possible before.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">--Self-join with temp table\r\nSELECT PL.ProductID, PL.Name, PL.OrderDate,\r\n   DATEDIFF(DAY, MAX(PL2.OrderDate), PL.OrderDate) AS DaysBetweenOrders\r\nFROM #ProductList AS PL \r\nLEFT JOIN #ProductList AS PL2 \r\nON PL2.ProductID = PL.ProductID AND PL.OrderDate &gt; PL2.OrderDate\r\nGROUP BY PL.ProductID\r\n       , PL.Name\r\n       , PL.OrderDate\r\nORDER BY PL.ProductID, PL.OrderDate;<\/pre>\n<p>&nbsp;<\/p>\n<h2>Scalar user-defined function<\/h2>\n<p>The performance of user-defined scalar functions was improved in 2019 with <a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/performance-sql-server\/get-your-scalar-udfs-to-run-faster-without-code-changes\/\">inlining<\/a>. I had hoped that the performance would not be too bad since SQL Server is running version 2019. Unfortunately, the less than operator (&lt;) kills the performance. Even adding in an index on <code>OrderDate<\/code> didn\u2019t help much. I killed the query after 25 seconds.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">--Scalar UDF\r\nGO\r\nCREATE OR ALTER   FUNCTION [dbo].[GetPreviousOrderDate]\r\n(\r\n\t\r\n\t@ProductID INT, @OrderDate DATETIME\r\n)\r\nRETURNS DATETIME\r\nAS\r\nBEGIN\r\n\t\r\n\tDECLARE @PrevOrderDate DATETIME;\r\n\tSELECT @PrevOrderDate = MAX(OrderDate) \r\n\tFROM Sales.SalesOrderHeader AS SOH \r\n\tJOIN Sales.SalesOrderDetail AS SOD \r\n\tON SOD.SalesOrderID = SOH.SalesOrderID\r\n\tWHERE SOD.ProductID = @ProductID AND SOH.OrderDate &lt; @OrderDate;\r\n\t-- Return the result of the function\r\n\tRETURN @PrevOrderDate;\r\nEND;\r\nGO\r\nCREATE INDEX test_index ON Sales.SalesOrderHeader (OrderDate);\r\nGO\r\n--Function call\r\nSELECT PL.ProductID, PL.Name, PL.OrderDate, \r\n\tdbo.GetPreviousOrderDate(PL.ProductID, PL.OrderDate)\r\nFROM #ProductList AS PL\r\nGROUP BY PL.ProductID\r\n       , PL.Name\r\n       , PL.OrderDate\r\nORDER BY PL.ProductID, PL.OrderDate;<\/pre>\n<p>Since I killed the query before it completed, I am not sure how long it would keep running. The execution looks simple enough, but it doesn\u2019t show that the function is called many times.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-92382\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/09\/graphical-user-interface-application-description.png\" alt=\"Execution plan for UDF\" width=\"1136\" height=\"507\" \/><\/p>\n<p><strong>Figure 7. The execution plan for the scalar UDF<\/strong><\/p>\n<h2>Table-valued functions<\/h2>\n<p>I\u2019ve often heard someone say, \u201cjust turn it into a table-valued function\u201d when UDF issues arise. However, it\u2019s still possible to do \u201cbad things\u201d with table-valued functions as well. There are two types of table-valued functions, multi-statement and inline. Multi-statement table-valued functions (MSTVF) can have loops, IF blocks, and table variables, so they do not scale well.<\/p>\n<p>Inline table-valued functions (ITVF) only allow a single query. Of course, the function could contain a poorly written query, but generally, you will see better performance with these. In this case, however, the performance is still not as good as using <code>LAG<\/code>. Note that the call to the ITVF also uses <code>OUTER<\/code> <code>APPLY<\/code>.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk \">--Inline table-valued function\r\nGO\r\nCREATE OR ALTER FUNCTION dbo.ITVF_GetPrevDate\r\n(\t\r\n\t@ProductID INT, @OrderDate DATETIME\r\n)\r\nRETURNS TABLE \r\nAS\r\nRETURN \r\n(\r\n\t\r\n\tSELECT MAX(SOH.OrderDate) AS PrevOrderDate \r\n\tFROM Sales.SalesOrderHeader AS SOH \r\n\tJOIN Sales.SalesOrderDetail AS SOD\r\n\tON SOD.SalesOrderID = SOH.SalesOrderID\r\n\tWHERE SOD.ProductID = @ProductID\r\n\t\tAND SOH.OrderDate &lt; @OrderDate\r\n)\r\nGO\r\nSELECT PL.ProductID, PL.Name, PL.OrderDate,\r\n   DATEDIFF(DAY,IGPD.PrevOrderDate,PL.OrderDate) AS DaysBetweenOrders\r\nFROM #ProductList AS PL \r\nOUTER APPLY [dbo].[ITVF_GetPrevDate] (PL.ProductID,PL.OrderDate) IGPD\r\nORDER BY PL.ProductID, PL.OrderDate;<\/pre>\n<p>The ITVF takes about 12 seconds to run with over 59 million logical reads when using the temp table.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-92383\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/09\/a-picture-containing-text-description-automatical.png\" alt=\"Execution plan for ITVF and temp table\" width=\"1068\" height=\"157\" \/><\/p>\n<p><strong>Figure 8. The logical reads for the ITVF<\/strong><\/p>\n<h2>Cursor<\/h2>\n<p>I debated whether to include a section on cursors because I don\u2019t want to encourage anyone to start with a cursor solution. However, I remembered other situations when a cursor solution performed better compared to other techniques, so I decided to include it. Cursors are another tool in your T-SQL toolbox. They may be at the bottom of the box and a bit rusty from disuse, but a tool, nonetheless.<\/p>\n<p>One important thing to note about running the cursor solution is to turn off the Actual Execution Plan (or any other method you might be using to capture execution plans) and Statistics. With those turned off, the script took about 2 seconds to run!<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">--Cursor\r\n--Important! Also turn off Actual Execution Plan\r\nSET STATISTICS IO, TIME OFF\r\nGO\r\nALTER TABLE #ProductList\r\nADD DaysBetweenOrders INT;\r\nGO\r\nDECLARE @LastProductID INT, @ProductID INT; \r\nDECLARE @LastOrderDate DATETIME, @OrderDate DATETIME;\r\nDECLARE @DaysBetweenOrders INT;\r\nDECLARE Products CURSOR FAST_FORWARD FOR\r\n\tSELECT  p.ProductID, P.OrderDate\r\n\tFROM #ProductList AS P\r\n\tORDER BY p.ProductID, P.OrderDate\r\n\t;\t\r\nOPEN Products;\r\nFETCH NEXT FROM Products INTO @ProductID, @OrderDate;\r\nWHILE @@FETCH_STATUS = 0 \r\nBEGIN\r\n   IF @LastProductID = @ProductID BEGIN\r\n      SET @DaysBetweenOrders = DATEDIFF(DAY,@LastOrderDate,@OrderDate);\r\n      UPDATE #ProductList SET DaysBetweenOrders = @DaysBetweenOrders\r\n\tWHERE ProductID = @ProductID AND OrderDate =@OrderDate;\r\n   END;\r\n   SELECT @LastOrderDate = @OrderDate, @LastProductID = @ProductID;\r\n   FETCH NEXT FROM Products INTO @ProductID, @OrderDate;\r\nEND\r\nCLOSE Products; \r\nDEALLOCATE Products;\r\nSELECT P.ProductID\r\n     , P.Name\r\n     , P.OrderDate\r\n     , P.DaysBetweenOrders \r\nFROM #ProductList AS P\r\nORDER BY P.ProductID, P.OrderDate;<\/pre>\n<h2>How LAG compares to other techniques?<\/h2>\n<p>There are probably even more ways to write the query (once someone in a presentation insisted that a view would always outperform <code>LAG<\/code>), but it\u2019s not likely that any other method runs faster than <code>LAG <\/code>when you need a column from the previous row. Here are the results for each technique:<\/p>\n<table>\n<tbody>\n<tr>\n<td>\n<p><strong>Technique<\/strong><\/p>\n<\/td>\n<td>\n<p><strong>Indexed temp table?<\/strong><\/p>\n<\/td>\n<td>\n<p><strong>Time<\/strong><\/p>\n<\/td>\n<td>\n<p><strong>Logical reads<\/strong><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>LAG<\/p>\n<\/td>\n<td>\n<p>No<\/p>\n<\/td>\n<td>\n<p>300 ms<\/p>\n<\/td>\n<td>\n<p>365<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Self-join<\/p>\n<\/td>\n<td>\n<p>No<\/p>\n<\/td>\n<td>\n<p>20 sec<\/p>\n<\/td>\n<td>\n<p>3,103<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Derived table<\/p>\n<\/td>\n<td>\n<p>No<\/p>\n<\/td>\n<td>\n<p>2 sec<\/p>\n<\/td>\n<td>\n<p>127,723<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Common table expression<\/p>\n<\/td>\n<td>\n<p>No<\/p>\n<\/td>\n<td>\n<p>2 sec<\/p>\n<\/td>\n<td>\n<p>127,730<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>OUTER APPLY<\/p>\n<\/td>\n<td>\n<p>No<\/p>\n<\/td>\n<td>\n<p>12 sec<\/p>\n<\/td>\n<td>\n<p>86,281,577<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>OUTER APPLY<\/p>\n<\/td>\n<td>\n<p>Yes<\/p>\n<\/td>\n<td>\n<p>700 ms<\/p>\n<\/td>\n<td>\n<p>57,358<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Self-join<\/p>\n<\/td>\n<td>\n<p>Yes<\/p>\n<\/td>\n<td>\n<p>3 sec<\/p>\n<\/td>\n<td>\n<p>452<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Scalar UDF<\/p>\n<\/td>\n<td>\n<p>Yes<\/p>\n<\/td>\n<td>\n<p>Killed the query<\/p>\n<\/td>\n<td>\n<p>Unknown<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Inline table-valued function<\/p>\n<\/td>\n<td>\n<p>Yes<\/p>\n<\/td>\n<td>\n<p>12 sec<\/p>\n<\/td>\n<td>\n<p>59,622,091<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p>Cursor<\/p>\n<\/td>\n<td>\n<p>Yes<\/p>\n<\/td>\n<td>\n<p>2 sec<\/p>\n<\/td>\n<td>\n<p>Unknown<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>As long as <code>OUTER<\/code> <code>APPLY<\/code> had the pre-aggregated temp table to work with, it performed almost as well as <code>LAG<\/code>. Otherwise, the other methods ran in 2 seconds or more.<\/p>\n<p>For this specific problem, <code>LAG<\/code> performed the best. The lesson to learn here is that there are many ways to write a query, so try other techniques when you experience issues with performance.<\/p>\n<p><em>If you like this article, you might also like\u00a0<a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/t-sql-programming\/introduction-to-t-sql-window-functions\/\">Introduction to T-SQL Window Functions<\/a><\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>LAG pulls a column from another row without a self-join. In this article, Kathi Kellenberger shows how LAG compares to other techniques&hellip;<\/p>\n","protected":false},"author":110218,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[53,143531],"tags":[145477,5842,145478],"coauthors":[11292],"class_list":["post-92375","post","type-post","status-publish","format-standard","hentry","category-featured","category-t-sql-programming-sql-server","tag-lag-and-lead","tag-sql-monitor","tag-table-valued-function"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/92375","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/110218"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=92375"}],"version-history":[{"count":7,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/92375\/revisions"}],"predecessor-version":[{"id":93162,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/92375\/revisions\/93162"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=92375"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=92375"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=92375"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=92375"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}