{"id":79311,"date":"2018-06-13T16:01:12","date_gmt":"2018-06-13T16:01:12","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=79311"},"modified":"2021-09-29T16:21:01","modified_gmt":"2021-09-29T16:21:01","slug":"introduction-to-t-sql-window-functions","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/introduction-to-t-sql-window-functions\/","title":{"rendered":"Introduction to T-SQL Window Functions"},"content":{"rendered":"<p>I\u2019ve spent the past six years traveling around the US telling database professionals about T-SQL Window Functions at SQL Saturdays and other events. I\u2019m amazed at how few people have heard about these functions and even fewer who are using them. At the end of each presentation, one or more people come up to say that they wished they learned about these functions years earlier because they could have been beneficial for so many queries.<\/p>\n<p>These functions have been promoted to improve performance over other, more traditional methods. I partially agree. They make many queries easier to write, and, sometimes, they improve performance.<\/p>\n<h2>Nothing to do with the Windows OS<\/h2>\n<p>These functions are part of the ANSI SQL 2003 Standards and, in the case of SQL Server, are T-SQL functions used to write queries. They have nothing to do with the Windows operating system or any API calls. Other database systems, such as Oracle, have also included these as part of their own SQL language.<\/p>\n<p>Window (also, windowing or windowed) functions perform a calculation over a set of rows. I like to think of \u201clooking through the window\u201d at the rows that are being returned and having one last chance to perform a calculation. The window is defined by the <strong>OVER<\/strong> clause which determines if the rows are partitioned into smaller sets and if they are ordered. In fact, if you use a window function you will always use an <strong>OVER<\/strong> clause. The <strong>OVER<\/strong> clause is also part of the <strong>NEXT VALUE FOR<\/strong> syntax required for the sequence object, but, otherwise it\u2019s used with window functions.<\/p>\n<p>The <strong>OVER<\/strong> clause may contain a <strong>PARTITION BY<\/strong> option. This breaks the rows into smaller sets. You might think that this is the same as <strong>GROUP BY<\/strong>, but it\u2019s not. When grouping, one row per unique group is returned. When using <strong>PARTITION BY<\/strong>, all of the detail rows are returned along with the calculations. If you have a window in your home that is divided into panes, each pane is a window. When thinking about window functions, the entire set of results is a partition, but when using <strong>PARTITION BY<\/strong>, each partition can also be considered a window. <strong>PARTITION BY<\/strong> is supported \u2013 and optional \u2013 for all windowing functions.<\/p>\n<p>The <strong>OVER<\/strong> clause may also contain an <strong>ORDER BY<\/strong> option. This is independent of the <strong>ORDER BY<\/strong> clause of the query. Some of the functions require <strong>ORDER BY<\/strong>, and it\u2019s not supported by the others. When the order of the rows is important when applying the calculation, the <strong>ORDER BY<\/strong> is required.<\/p>\n<p>Window functions may be used only in the <strong>SELECT<\/strong> and <strong>ORDER BY<\/strong> clauses of a query. They are applied after any joining, filtering, or grouping.<\/p>\n<h2>Ranking Functions<\/h2>\n<p>The most commonly used window functions, ranking functions, have been available since 2005. That\u2019s when Microsoft introduced <strong>ROW_NUMBER<\/strong>, <strong>RANK<\/strong>, <strong>DENSE_RANK<\/strong>, and <strong>NTILE<\/strong>. <strong>ROW_NUMBER<\/strong> is used very frequently, to add unique row numbers to a partition or to the entire result set. Adding a row number, or one of the other ranking functions, is not usually the goal, but it is a step along the way to the solution.<\/p>\n<p><strong>ORDER BY<\/strong> is required in the <strong>OVER<\/strong> clause when using <strong>ROW_NUMBER<\/strong> and the other functions in this group. This tells the database engine the order in which the numbers should be applied. If the values of the columns or expressions used in the <strong>ORDER BY<\/strong> are not unique, then <strong>RANK<\/strong> and <strong>DENSE_RANK<\/strong> will deal with the ties, while <strong>ROW_NUMBER<\/strong> doesn\u2019t care about ties. <strong>NTILE<\/strong> is used to divide the rows into buckets based on the <strong>ORDER BY<\/strong>.<\/p>\n<p>One benefit of <strong>ROW_NUMBER<\/strong> is the ability to turn non-unique rows into unique rows. This could be used to eliminate duplicate rows, for example.<\/p>\n<p>To show how this works, start with a temp table containing duplicate rows. The first step is to create the table and populate it.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">CREATE TABLE #Duplicates(Col1 INT, Col2 CHAR(1));\r\nINSERT INTO #Duplicates(Col1, Col2) \r\nVALUES(1,'A'),(2,'B'),(2,'B'),(2,'B'),\r\n\t(3,'C'),(4,'D'),(4,'D'),(5,'E'),\r\n\t(5,'E'),(5,'E');\r\nSELECT * FROM #Duplicates;<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"302\" height=\"553\" class=\"wp-image-79312\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-71.png\" \/><\/p>\n<p>Adding <strong>ROW_NUMBER<\/strong> and partitioning by each column will restart the row numbers for each unique set of rows. You can identify the unique rows by finding those with a row number equal to one.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT Col1, Col2, \r\n   ROW_NUMBER() OVER(PARTITION BY Col1, Col2 ORDER BY Col1) AS RowNum\r\nFROM #Duplicates;<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"476\" height=\"556\" class=\"wp-image-79313\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-72.png\" \/><\/p>\n<p>Now, all you have to do is to delete any rows that have a row number greater than one. The problem is that you cannot add window functions to the <strong>WHERE<\/strong> clause.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">DELETE #Duplicates \r\nWHERE ROW_NUMBER() OVER(PARTITION BY Col1, Col2 ORDER BY Col1) &lt;&gt; 1;<\/pre>\n<p>You\u2019ll see this error message:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1429\" height=\"138\" class=\"wp-image-79314\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-73.png\" \/><\/p>\n<p>The way around this problem is to separate the logic using a common table expression (CTE). You can then delete the rows right from the CTE.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">WITH Dupes AS (\r\n   SELECT Col1, Col2, \r\n     ROW_NUMBER() OVER(PARTITION BY Col1, Col2 ORDER BY Col1) AS RowNum\r\n   FROM #Duplicates)\r\nDELETE Dupes \r\nWHERE RowNum &lt;&gt; 1;\r\nSELECT * FROM #Duplicates;<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"311\" height=\"331\" class=\"wp-image-79315\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-74.png\" \/><\/p>\n<p>Success! The extra rows were deleted, and a unique set of rows remains.<\/p>\n<p>To see the difference between <strong>ROW_NUMBER<\/strong>, <strong>RANK<\/strong>, and <strong>DENSE_RANK<\/strong>, run this query:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">USE Adventureworks2017; --Or whichever version you have\r\nGO\r\nSELECT SalesOrderID, OrderDate, CustomerID, \r\n\tROW_NUMBER() OVER(ORDER BY OrderDate) As RowNum,\r\n\tRANK() OVER(ORDER BY OrderDate) As Rnk,\r\n\tDENSE_RANK() OVER(ORDER BY OrderDate) As DenseRnk\r\nFROM Sales.SalesOrderHeader\r\nWHERE CustomerID = 11330;<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1288\" height=\"618\" class=\"wp-image-79316\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-75.png\" \/><\/p>\n<p>The <strong>ORDER BY<\/strong> for each <strong>OVER<\/strong> clause is <strong>OrderDate<\/strong> which is not unique. This customer placed two orders on 2013-10-24. <strong>ROW_NUMBER<\/strong> just continued assigning numbers and didn\u2019t do anything different even though there is a duplicate date. <strong>RANK<\/strong> assigned 6 to both rows and then caught up to <strong>ROW_NUMBER<\/strong> with an 8 on the next row. <strong>DENSE_RANK<\/strong> also assigned 6 to the two rows but assigned 7 to the following row.<\/p>\n<p>Two explain the difference, think of <strong>ROW_NUMBER<\/strong> as <em>positional<\/em>. <strong>RANK<\/strong> is both <em>positional<\/em> and <em>logical<\/em>. Those two rows are ranked logically the same, but the next row is ranked by the position in the set. <strong>DENSE_RANK<\/strong> ranks them <em>logically<\/em>. Order 2013-11-04 is the 7<sup>th<\/sup> unique date.<\/p>\n<p>The final function in this group is called <strong>NTILE<\/strong>. It assigns bucket numbers to the rows instead of row numbers or ranks. Here is an example:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT SP.FirstName, SP.LastName,\r\n\tSUM(SOH.TotalDue) AS TotalSales, \r\n\tNTILE(4) OVER(ORDER BY SUM(SOH.TotalDue)) * 1000 AS Bonus\r\nFROM [Sales].[vSalesPerson] SP \r\nJOIN Sales.SalesOrderHeader SOH \r\n     ON SP.BusinessEntityID = SOH.SalesPersonID \r\nWHERE SOH.OrderDate &gt;= '2012-01-01' AND SOH.OrderDate &lt; '2013-01-01'\r\nGROUP BY FirstName, LastName;<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"576\" height=\"477\" class=\"wp-image-79317\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-76.png\" \/><\/p>\n<p><strong>NTILE<\/strong> has a parameter, in this case 4, which is the number of buckets you want to see in the results. The <strong>ORDER BY<\/strong> is applied to the sum of the sales. The rows with the lowest 25% are assigned 1, the rows with the highest 25% are assigned 4. Finally, the results of <strong>NTILE<\/strong> are multiplied by 1000 to come up with the bonus amount. Since 14 cannot be evenly divided by 4, an extra row goes into each of the first two buckets.<\/p>\n<h2>Window Aggregates<\/h2>\n<p>Window aggregates were also introduced with SQL Server 2005. These make writing some tricky queries easy but will often perform worse than older techniques. They allow you to add your favourite aggregate function to a non-aggregate query. Say, for example you would like to display all the customer orders along with the subtotal for each customer. By adding a <strong>SUM<\/strong> using the <strong>OVER<\/strong> clause, you can accomplish this very easily:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT CustomerID, OrderDate, SalesOrderID, TotalDue, \r\n\tSUM(TotalDue) OVER(PARTITION BY CustomerID) AS SubTotal \r\nFROM Sales.SalesOrderHeader;<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"242\" class=\"wp-image-79318\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-77.png\" \/><\/p>\n<p>By adding the <strong>PARTITION BY<\/strong>, a subtotal is calculated for each customer. Any aggregate function can be used, and <strong>ORDER BY<\/strong> in the <strong>OVER<\/strong> clause is not supported.<\/p>\n<h2>Window Aggregate Enhancements in 2012<\/h2>\n<p>Beginning with 2012, you can add an <strong>ORDER BY<\/strong> to the <strong>OVER<\/strong> clause to window aggregates to produce running totals and moving averages, for example. At the same time, Microsoft introduced the concept of framing. Adding a <strong>PARTITION BY<\/strong> is like dividing a window into panes. Adding framing is like creating a stained-glass window. Each row has an individual window where the expression will be applied.<\/p>\n<p>With this enhancement, you can create running totals even without adding the framing syntax. Here is an example that returns a running total by customer:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT CustomerID, OrderDate, SalesOrderID, TotalDue, \r\n   SUM(TotalDue) OVER(PARTITION BY CustomerID ORDER BY SalesOrderID) \r\n      AS RunningTotal \r\nFROM Sales.SalesOrderHeader;<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1389\" height=\"513\" class=\"wp-image-79319\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-78.png\" \/><\/p>\n<p>The default frame, which is used if a frame is not specified, is <strong>RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW<\/strong>. Unfortunately, this will not perform as well as if you specify this frame instead: <strong>ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW<\/strong>. The difference is the word <strong>ROWS<\/strong>. <strong>RANGE<\/strong> is only partially implemented at this time, and it\u2019s meant for working with periods of time, while <strong>ROWS<\/strong> is positional. The frame, <strong>ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW<\/strong>, means that the window consists of the first row of the partition and all the rows up to the current row. Each calculation is done over a different set of rows. For example, when performing the calculation for row 4, the rows 1 to 4 are used.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1215\" height=\"206\" class=\"wp-image-79320\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-79.png\" \/><\/p>\n<p>When performing the calculation for row 5, the rows are 1 to 5. The window grows larger as you move from one row to the next.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-79321\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-80.png\" width=\"1178\" height=\"178\" \/><\/p>\n<p>You can also use the syntax <strong>ROWS BETWEEN N PRECEEDING AND CURRENT ROW<\/strong> or <strong>ROWS BETWEEN CURRENT ROW AND N FOLLOWING<\/strong>. This could be useful for calculating a three-month moving average, for example. The following figure represents <strong>ROWS BETWEEN 2 PRECEDING AND CURRENT ROW<\/strong>.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1207\" height=\"170\" class=\"wp-image-79322\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-81.png\" \/><\/p>\n<p>When 5 is the current row, the window moves; it doesn\u2019t change size.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1203\" height=\"141\" class=\"wp-image-79323\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-82.png\" \/><\/p>\n<p>Here is the list of terms you need to know when writing the framing option:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1173\" height=\"385\" class=\"wp-image-79324\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-83.png\" \/><\/p>\n<p>I admit that this syntax is a bit confusing but using <a href=\"https:\/\/www.red-gate.com\/products\/sql-development\/sql-prompt\/\">SQL Prompt<\/a> helps makes writing the framing option easier!<\/p>\n<h2>Offset Functions<\/h2>\n<p>Also included with the release of SQL Server 2012 are four functions that allow you to include values from other rows \u2013 without doing a self-join. Microsoft calls these \u2018analytic functions\u2019, but I always refer to them as \u2018offset functions\u2019 when presenting on this topic. Two of the functions allow you to pull columns or expressions from a row before (<strong>LAG<\/strong>) or after (<strong>LEAD<\/strong>) the current row. The other two functions allow you to return values from the first row of the partition (<strong>FIRST_VALUE<\/strong>) or last row of the partition (<strong>LAST_VALUE<\/strong>). <strong>FIRST_VALUE<\/strong> and <strong>LAST_VALUE<\/strong> also require framing, so be sure to include the frame when using these functions. All four of the functions require the <strong>ORDER BY<\/strong> option of the <strong>OVER<\/strong> clause. That makes sense, because the database engine must know the order of the rows to figure out which row contains the value to return.<\/p>\n<p>Some people have a favourite band; some people have a favourite movie. I have a favourite function \u2013 <strong>LAG<\/strong>. It\u2019s easy to use (no frame!) and performs great. Here is an example:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk \"> SELECT CustomerID, OrderDate, SalesOrderID, \r\n   LAG(SalesOrderID) OVER(PARTITION BY CustomerID ORDER BY SalesOrderID\r\n   ) AS PrevOrder\r\nFROM Sales.SalesOrderHeader\r\nORDER BY CustomerID;<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1031\" height=\"457\" class=\"wp-image-79325\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-84.png\" \/><\/p>\n<p><strong>LAG<\/strong> and <strong>LEAD<\/strong> require an argument \u2013 the column or expression you want to return. By default, <strong>LAG<\/strong> returns the value from the previous row, and <strong>LEAD<\/strong> returns the value from the following row. You can modify that by supplying a value for the <strong>OFFSET<\/strong> parameter, which is 1 by default. Notice that the first row of the partition returns <strong>NULL<\/strong>. If you wish to override the <strong>NULL<\/strong>s, you can supply a <strong>DEFAULT<\/strong> value. Here is a similar query that goes back two rows and has a default value:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT CustomerID, OrderDate, SalesOrderID, \r\n   LAG(SalesOrderID,2,0) OVER(PARTITION BY CustomerID \r\n    ORDER BY SalesOrderID) AS Back2Orders\r\nFROM Sales.SalesOrderHeader;<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1066\" height=\"474\" class=\"wp-image-79326\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-85.png\" \/><\/p>\n<p><strong>FIRST_VALUE<\/strong> and <strong>LAST_VALUE<\/strong> can be used to find a value from the very first row or very last row of the partition. Be sure to specify the frame, not only for performance reasons, but because the default frame doesn\u2019t work as you would expect with <strong>LAST_VALUE<\/strong>. The default frame, <strong>RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW<\/strong>, only goes up to the current row. The last row of the partition is not included. To get the expected results, be sure to specify<strong> ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING<\/strong> when using <strong>LAST_VALUE<\/strong>. Here is an example using <strong>FIRST_VALUE<\/strong>:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT CustomerID, OrderDate, SalesOrderID, \r\n   FIRST_VALUE(SalesOrderID) OVER(PARTITION BY CustomerID \r\n      ORDER BY SalesOrderID\r\n   ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS FirstOrder\r\nFROM Sales.SalesOrderHeader;<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1026\" height=\"473\" class=\"wp-image-79327\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-86.png\" \/><\/p>\n<h2>Statistical Functions<\/h2>\n<p>Microsoft groups these four functions \u2013 <strong>PERCENT_RANK<\/strong>, <strong>CUME_DIST<\/strong>, <strong>PERCENTILE_DISC<\/strong>, <strong>PERCENTILE_CONT<\/strong> \u2013 along with the offset functions calling all eight the analytic functions. Since I like to distinguish these from the offset functions, I call these statistical.<\/p>\n<p><strong>PERCENT_RANK<\/strong> and <strong>CUME_DIST<\/strong> provide a ranking for each row over a partition. They differ slightly. <strong>PERCENT_RANK<\/strong> returns the percentage of rows that rank lower than the current row. \u201cMy score is higher than 90% of the scores.\u201d <strong>CUME_DIST<\/strong>, or cumulative distribution, returns the exact rank. \u201cMy score is at 90% of the scores.\u201d Here is an example using the average high temperature in St. Louis for each month. Note that the ranks were determined by the Fahrenheit temperature.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">CREATE TABLE #MonthlyTempsStl(MName varchar(15), AvgHighTempF INT, \r\n     AvgHighTempC INT)\r\nINSERT INTO #MonthlyTempsStl(MName, AvgHighTempF, AvgHighTempC)\r\nVALUES('Jan',40,4),('Feb',45, 7),('Mar',56, 13),('Apr',67, 20),\r\n\t  ('May',76,25),('Jun',85,30),('Jul',89,32),('Aug',88,31),\r\n\t  ('Sep',80,27),('Oct',69,20),('Nov',56,13),('Dec',43,6);\r\nSELECT MName, AvgHighTempF,AvgHighTempC, \r\n   RANK() OVER(ORDER BY AvgHighTempF) AS Rank,\r\n   PERCENT_RANK() OVER(ORDER BY AvgHighTempF)  AS PercentRank,\r\n   CUME_DIST() OVER(ORDER BY AvgHighTempF)  AS CumeDist\r\nFROM #MonthlyTempsStl;<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1495\" height=\"623\" class=\"wp-image-79328\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-87.png\" \/><\/p>\n<p>The ranks are not determined by the relative values, but by the positions of the rows. Notice that March and November have the same average high temp, so they were ranked the same.<\/p>\n<p>You may be wondering how to calculate <strong>PERCENT_RANK<\/strong> and <strong>CUME_DIST<\/strong>. Here are the formulas:<\/p>\n<pre class=\"lang:none theme:none\">PERCENT_RANK = (Rank -1)\/(Row count -1)\r\nCUME_DIST = (Rank)\/(Row count)<\/pre>\n<p><strong>PERCENTILE_DISC<\/strong> and <strong>PERCENTILE_CONT<\/strong> work in the opposite way. Given a percent rank, find the value at that rank. They differ in that <strong>PERCENTILE_DISC<\/strong> will return a value that exists in the set while <strong>PERCENTILE_CONT<\/strong> will calculate an exact value if none of the values in the set falls precisely at that rank. You can use <strong>PERCENTILE_CONT<\/strong> to calculate a median by supplying 0.5 as the percent rank. For example, which temperature ranks at 50% in St. Louis?<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk \">SELECT MName, AvgHighTempF,AvgHighTempC,\r\n   RANK() OVER(ORDER BY AvgHighTempF) AS Rank,\r\n   PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER BY AvgHighTempF) \r\n      OVER() AS PercentileCont,\r\n   PERCENTILE_DISC(0.5) WITHIN GROUP(ORDER BY AvgHighTempF) \r\n      OVER() AS PercentileDisc\r\nFROM #MonthlyTempsStl;<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1278\" height=\"616\" class=\"wp-image-79329\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/06\/word-image-88.png\" \/><\/p>\n<p>The <strong>PERCENTILE_CONT<\/strong> function takes the average of the two values closest to the middle, 67 and 69, and averages them. <strong>PERCENTILE_DISC<\/strong> returns an exact value, 67. Also notice that these two functions have an extra clause not seen in the other functions, <strong>WITHIN GROUP<\/strong>, that contains the <strong>ORDER BY<\/strong> instead of within the <strong>OVER<\/strong> clause.<\/p>\n<h2>Summary<\/h2>\n<p>This article is a very quick overview of T-SQL window functions. Two types of functions were released with SQL Server 2005, the ranking functions and window aggregates. With 2012, you have enhanced window aggregate with framing and the analytic functions. I like to separate the analytic functions into two groups, the offset and statistical functions. Window functions make many queries easier to write, and I believe that is the main benefit. In some cases, the queries will perform better, too, but that is a discussion for another day.<\/p>\n<p>I hope this article has inspired you to learn more about these fantastic functions!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>T-SQL window functions were introduced in 2005 with more functionality added in 2012. Many database professionals are not aware of these useful functions. In this article, Kathi Kellenberger provides a quick overview of just what a window function is as well as examples of each type of function.&hellip;<\/p>\n","protected":false},"author":110218,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143531],"tags":[5134],"coauthors":[11292],"class_list":["post-79311","post","type-post","status-publish","format-standard","hentry","category-t-sql-programming-sql-server","tag-sql-prompt"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/79311","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/110218"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=79311"}],"version-history":[{"count":11,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/79311\/revisions"}],"predecessor-version":[{"id":83359,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/79311\/revisions\/83359"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=79311"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=79311"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=79311"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=79311"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}