T-SQL Window Function Speed Phreakery: The FIFO Stock Inventory Problem

Back in 2009 and 2010, Phil Factor hosted the SQL Speed Phreak Competition, in which programmers were challenged to construct the fastest solution to a series of common programming challenges. Unfailingly, the best solutions used set-based T-SQL techniques to achieve lightning-fast execution times, even when the data load scaled to millions of rows. Although fast, the solutions were often complex and hard to understand, and occasionally used slightly unorthodox techniques.

The Simple-Talk editors called in “Aunt Kathi” to help. My job was to explain as simply as I could the mechanics of these solutions, and so make them more accessible to developers who faced similar business problems, and needed a faster and more scalable solution than more conventional “row by row” processing techniques could offer. In each case, I also tried to offer an alternative solution that “learned” from the winning solution, but was simpler and therefore easier to maintain, but inevitably compromised a little on speed.

Skip forward a few years, and SQL Server 2012 has now introduced many powerful new T-SQL window functions and enhancements. Could they offer a simpler alternative to the original solutions that would match or perhaps even beat them on speed?

My previous article revisited the running total problem, where I proved that a solution that used the OVER clause and a ‘sliding window’ of data, with some aggregate functions, performed identically to the original solution, which relied on the unorthodox “query update” technique.

This article revisits the ‘FIFO (First In, First Out) Stock Inventory Problem’, a somewhat-simplified, but still realistic inventory accounting problem, where stock items are either purchased, sold, or returned, and where the oldest items in the queue of stock items (those that were the first to be purchased or returned) must be the first items sold. The requirement is to calculate the remaining count, and value, of each item after all stock transactions are performed.

The original winning solution from Dave Ballantyne is brilliant, but a brain twister for the uninitiated. It uses one statement comprised of several common table expressions (CTEs) and includes a couple of CROSS APPLY type joins. Was there a simpler but equally-speedy window function-based solution? Let’s find out.

Understanding the FIFO Stock Inventory Problem

In this puzzle, stock items are purchased ( IN), returned ( RET), and sold ( OUT). Each type of stock item is identified by an ArticleID. Each movement of stock in or out of the warehouse, due to a purchase, sale or return of a given type of stock, results in a row being added to the Stock table, uniquely identified by a value in the StockID identity column, and describing how many items of that type of stock were added or removed, the price for purchases, the date of the stock transaction, and so on.

The objective is to calculate the current value of the stock, by ArticleID, after all transactions are complete. Prices are available only when items are purchased. If an item is returned, the most recent purchase price is used. Items are sold on a “first in, first out” basis, and the price to use is the price paid when the item was added to the inventory.

The following table shows an extract of data from the Stock table, all relating to a single type of stock, and hopefully illustrating how the rules work.

ArticleID	TranDate	TranCode	Items	Price	Current Count	Current Value	What Happened?
1001	2016-01-01	IN	10	$100	10	$1000	Added 10 at $100
1001	2016-01-02	IN	30	$101	40	$4030	Added 30 at $101
1001	2016-01-03	OUT	25	—	15	$1515	Added 10 at $100 and 15 at $101
1001	2016-01-04	RET	10	—	25	$2525	Added 10 at $101
1001	2016-01-05	IN	9	$99	34	$3416	Added 9 at $99
1001	2016-01-06	OUT	22	—	12	$1194	Sold 22 at $101

The key to solving this puzzle is understanding that what is left after the transactions are complete is all that matters. You do not need to step through the transactions or even calculate the price at which items were sold.

I like to compare this puzzle to the common habit of cleaning out the spare change from one’s pockets at the end of each day. Instead of a change jar, imagine that you had a change queue on the dresser. Each night, you add the day’s change to the end of the queue. The next morning, you take some coins from the front of the queue if you think you would need them that day. If you want to know the current value of the coins, you only have to look at the existing coins in the queue to calculate the value. You do not need to know about the coins that have been previously removed. You only need to look at each remaining coin, figure out the value of each one, and add them up.

The FIFO Stock Inventory Problem is quite a bit more complex than coins on a dresser, but you still must concentrate on how many of each ArticleID are left and the price when each item was added to stock. Here are the steps to solving the puzzle:

Calculate a final count of the items by ArticleID. This is easily done with an aggregate query. At the end of the transactions, there are 12 items left of ArticleID 1001 in the example above.
Use a reverse running total of purchases and returns to figure out which transactions are responsible for the remaining items. A reverse running total is used because the purchases and returns at the end are responsible for the remaining stock.
Find the date at which the reverse running total equals or exceeds the amount remaining at the end of all transactions. On 2016-01-04 the reverse running total, 19, exceeded the number remaining, 12.
Sum the items from all transactions after that date, in this case the transaction on 2016-01-06, which was 9 items.
On the date calculated in step 3, figure out how many of the items added to stock should be counted. This is the difference between the items remaining (12) and the previous running total (9). One way to calculate the previous reverse running total is by subtracting the items purchased or returned from the current reverse running total. Then the formula becomes Items Remaining – (Reverse Running Total – Items Purchased or Returned). For this example, that would be 12 – (19 – 10) which resolves to 3.
Multiply remaining items by the price. For returned items, the price must be calculated. In the example above, the ten items returned are priced at $101 because that was the last purchase price.

ArticleID	TrandDate	TranCode	Items	Price	Reverse Running Total	Source of Remaining Items
1001	2016-01-01	IN	10	$100	59
1001	2016-01-02	IN	30	$101	49
1001	2016-01-04	RET	10	$101	19	3
1001	2016-01-05	IN	9	$99	9	9

Dave Ballantyne’s Original Winning Solution

Listing 1 shows Dave Ballantyne’s original solution. As part of his solution, he added several indexes, which was not against the rules. Also, evidently, creating the indexes didn’t count against the time to run the solution. On my laptop with 8 GB RAM, 4 procs and an SSD drive, the query took 1 second to run.

CREATENONCLUSTERED INDEX IX_Dave_General

ON dbo.Stock

( ArticleIDASC,

TranDateDESC,

TranCodeASC

)

INCLUDE ( Items, Price);

CREATENONCLUSTERED INDEX IX_Dave_Items

ON dbo.Stock

( ArticleID ASC,

TranDate ASC )

INCLUDE (Items)

WHERE (TranCode IN('IN','RET'));

CREATENONCLUSTERED INDEX IX_Dave_Price

ON dbo.Stock

( ArticleID ASC,

TranDate ASC

)

INCLUDE ( Price)

WHERE (TranCode='IN');

WITH cteStockSum

AS (SELECT ArticleID ,

SUM(CASEWHEN TranCode ='OUT' THEN 0 - Items

ELSE Items

END)AS TotalStock

FROM dbo.Stock

GROUP BY ArticleID

cteReverseInSum

AS (SELECT s.ArticleID,

s.TranDate,

( SELECT SUM(i.Items)

FROM dbo.StockAS i WITH( INDEX( IX_Dave_Items) )

WHERE i.ArticleID= s.ArticleID

AND i.TranCodeIN ( 'IN','RET' )

AND i.TranDate>= s.TranDate

) AS RollingStock,

s.Items AS ThisStock

FROM dbo.StockAS s

WHERE s.TranCodeIN ( 'IN','RET' )

/* Using the rolling balance above find the first stock movement in that meets (or exceeds) our required stock level */

/* and calculate how much stock is required from the earliest stock in */

cteWithLastTranDate

AS (SELECT w.ArticleID,

w.TotalStock,

LastPartialStock.TranDate,

LastPartialStock.StockToUse,

LastPartialStock.RunningTotal,

w.TotalStock- LastPartialStock.RunningTotal

+ LastPartialStock.StockToUseAS UseThisStock

FROM cteStockSum AS w

CROSS APPLY ( SELECTTOP ( 1)

z.TranDate,

z.ThisStockAS StockToUse ,

z.RollingStockAS RunningTotal

FROM cteReverseInSum AS z

WHERE z.ArticleID= w.ArticleID

AND z.RollingStock>= w.TotalStock

ORDER BY z.TranDateDESC

) AS LastPartialStock

)

/* Sum up the cost of 100% of the stock movements in after the returned stockid and for that stockid we need 'UseThisStock' items' */

SELECT y.ArticleID,

y.TotalStockAS CurrentItems ,

SUM(CASEWHEN e.TranDate= y.TranDateTHEN y.UseThisStock

ELSE e.Items

END * Price.Price)AS CurrentValue

FROM cteWithLastTranDate AS y

INNER JOIN dbo.StockAS e WITH( INDEX( IX_Dave_Items) )

ON e.ArticleID= y.ArticleID

AND e.TranDate>= y.TranDate

AND e.TranCodeIN ( 'IN','RET' )

CROSS APPLY (

/* Find the Price of the item in */SELECT TOP( 1 )

p.Price

FROM dbo.StockAS p WITH

(INDEX ( IX_Dave_Price ) )

WHERE p.ArticleID= e.ArticleID

AND p.TranDate<= e.TranDate

AND p.TranCode= 'IN'

ORDER BY p.TranDate DESC

) AS Price

GROUP BY y.ArticleID ,

y.TotalStock

ORDER BY y.ArticleID;

Listing 1: Dave Ballantyne’s winning code

You can find a complete description of this solution in my original article, but I’ll summarize the highlights. The solution starts with an aggregate query in cteStockSum, calculating the total remaining stock ( TotalStock) of each ArticleID, after all transactions have completed. It then uses a correlated sub-query to find the reverse running total of articles purchased or returned, in cteReverseInSum.

Having calculated the final count and the reverse running total, you can find the date where the number of items at the end of the queue meets or exceeds the TotalStock amount. This is done in the cteLastTranDte CTE, using CROSS APPLY and TOP( 1). On that date, you may need all of the items or just some of the items.

By using CROSS APPLY, all columns from the row can be returned. Dave takes advantage of this and figures out how many items are needed on that date. In the outer query, the value for each row is calculated by multiplying the items by the price. The query in the CROSS APPLY returns the price for the latest purchase, which is then the price applied for any the returns in the remaining stock.

The Aunt Kathi Window Function Method

There is no one tool for the job when working with T-SQL, and using windowing functions is not an exception to this rule. I systematically tried replacing each of Dave’s calculations with the equivalent calculation using window function, but occasionally the performance suffered badly as a result.

Finally, I settled on the code shown in Listing 2, which is sort of a hybrid approach. On my laptop, the query took 2 seconds to run. I was never able to match the 1-second execution time of Dave’s original code.

WITH ItemEndTotal

AS (SELECT ArticleID ,

SUM(CASEWHEN TranCode ='OUT' THEN-Items

ELSE Items

END)AS FinalCount

FROM Stock

GROUP BY ArticleID

ReverseRunningTotal

AS (SELECT StockID ,

ArticleID ,

TranCode ,

TranDate ,

Items ,

Price ,

SUM(Items)OVER (PARTITION BY ArticleID ORDER BY TranDate

ROWS BETWEEN CURRENTROW ANDUNBOUNDED FOLLOWING ) ReverseTotal

FROM Stock

WHERE TranCode IN( 'IN','RET' )

FindDate

AS (SELECT DISTINCT

T.ArticleID,

FinalCount ,

LAST_VALUE(TranDate)OVER (PARTITION BY P.ArticleID ORDER BY TranDate

ROWS BETWEEN CURRENTROW ANDUNBOUNDED FOLLOWING )AS TheDate

FROM ItemEndTotal AS T

JOIN ReverseRunningTotal AS P

ON T.ArticleID= P.ArticleID

AND P.ReverseTotal>= T.FinalCount

)

SELECT RRT.ArticleID,

FinalCount ,

SUM(CASEWHEN TheDate = TranDate

THEN FinalCount- ( ReverseTotal - Items )

ELSE Items

END * PurchasePrice)AS Value

FROM ReverseRunningTotal RRT

JOIN FindDate

ON RRT.ArticleID= FindDate.ArticleID

CROSS APPLY (SELECT TOP( 1 )

Price AS PurchasePrice

FROM ReverseRunningTotal AS R

WHERE RRT.ArticleID= R.ArticleID

AND TranCode= 'IN'

AND R.TranDate<= RRT.TranDate

ORDER BY TranDate DESC

) AS P

WHERE RRT.TranDate>= TheDate

GROUP BY RRT.ArticleID ,

FinalCount

ORDER BY RRT.ArticleID;

Listing 2: The Aunt Kathi FIFO using window functions

I started with an aggregate query in a CTE, ItemEndTotal, which provides a final count of each ArticleID. I could have eliminated the ItemEndTotal CTE and instead used a windowing aggregate function in the second CTE, as follows:

1	SUM(CASE WHEN TranCode= 'OUT'THEN -ItemsELSE Items END)OVER(PARTITIONBY P.ArticleID)

This, unfortunately, caused the performance to suffer so I abandoned that idea.

In the ReverseRunningTotal CTE, I used an accumulating window aggregate to calculate the reverse running total. If you use this technique, be sure to specify the frame ( ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) for the best performance. When running this step in isolation, the windowing function method out performs Dave’s method.

To find the date at which the reverse running total equals or exceeds the amount remaining at the end of all transactions, I used the LAST_ROW function. The LAST_ROW function allows you to access any column in the last row of a partition. I could have used the LAST_ROW function multiple times to grab all of the other columns needed from that row and then calculated how many stock items would be needed on the date. However, nesting the LAST_ROW expressions in a calculation made this part of the query extremely complex. Instead, I just performed the calculation in the outer query which ended up being much simpler. However, I believe this step is where my process took extra time over Dave’s solution. I needed to return only one row for each ArticleID, so I had to use DISTINCT. Using GROUP BY would not have worked in this situation because aggregation happens before windowing functions. Nevertheless, the performance was still acceptable, so I left this windowing function in place.

The outer query also uses a CROSS APPLY to find the prices for the returned items, which I just copied from Dave’s code. There is a way to use windowing functions to find the prices; Listing 3 shows a sample of the alternative code.

WITH GetMaxID

AS (SELECT ArticleID ,

TranDate ,

Price ,

TranCode ,

MAX(CASEWHEN TranCode ='IN' THEN StockID

END)OVER (PARTITION BY ArticleID ORDER BY TranDate

ROWS BETWEEN UNBOUNDEDPRECEDING ANDCURRENT ROW) AS MaxID

FROM Stock

WHERE TranCode IN( 'RET','IN' )

)

SELECT ArticleID ,

TranDate ,

Price ,

TranCode ,

MAX(Price)OVER (PARTITION BY ArticleID, MaxID) AS ThePrice

FROM GetMaxID;

Listing 3: An alternative way to find the price for returned items in the remaining stock

This is a technique I learned from the great Itzik Ben-Gan. In the first level, I am using MAX as an accumulating aggregate. I find the maximum StockID for purchases only up to the current row. This means that the MaxID for each return row will be the StockID of the previous purchase row. In the outer query, by partitioning by MaxID, the maximum price returned is the price I need. The technique is elegant but, in my tests, it slowed the query down quite a bit. The data had to be filtered first by the TranCode and then sorted by ArticleID and TranDate, so there is no way to create a good index for the query. Sorting is the bottleneck, and using this technique ends up adding several seconds to the overall time.

Conclusion

I still remember staring at Dave’s winning code for hours back in 2010 trying to figure out how his solution even worked. Eventually, I figured out that the stock items left, not the in and out transactions from the beginning, were what was important. In this article, I was able to apply windowing functions every step of the way. Unfortunately, I had to back out in a couple of places to get acceptable performance.

In my experience, windowing functions are great tools to help you solve complex business problems. Once you begin using them, you may find that you start coming up with set based solutions more quickly. Your queries will be easier to understand and maintain. Often, they will perform better than queries using traditional methods. Like anything else, you must test each solution to make sure that the performance is acceptable.

Register for Simple Talk

T-SQL Window Function Speed Phreakery: The FIFO Stock Inventory Problem

Understanding the FIFO Stock Inventory Problem

Dave Ballantyne’s Original Winning Solution

The Aunt Kathi Window Function Method

Conclusion

Article tags

About the author

Kathi Kellenberger

Kathi's contributions

Articles

Books

Top topics

Kathi's latest contributions:

Introduction to GitHub Actions Part 3: Self-hosted runners

Introduction to GitHub Actions Part 2: GitHub-hosted runners

Introduction to GitHub Actions Part 1: Getting Started

Understanding the FIFO Stock Inventory Problem

Dave Ballantyne’s Original Winning Solution

The Aunt Kathi Window Function Method

Conclusion

Article tags

Recommended

About the author

Kathi Kellenberger

Kathi's contributions

Articles

Books

Top topics

Kathi's latest contributions:

Introduction to GitHub Actions Part 3: Self-hosted runners

Introduction to GitHub Actions Part 2: GitHub-hosted runners

Introduction to GitHub Actions Part 1: Getting Started