Alternatives To SQL Server 2022 Built-in function GREATEST

Comments 0

Share to social media

If you haven’t already heard, SQL Server 2022  introduced a new built-in system function called GREATEST. Simply put, it is to a set of columns, variables, expressions etc. what the MAX function is to a set of values (i.e., rows) of a single column or expression. The opposite of GREATEST function is LEAST function which returns the smallest value from the supplied list of columns, variables, constants etc. GREATEST & LEAST can be considered a pair, just as MIN/MAX functions are.

So, on a SQL Server 2022 instance, you can simply call GREATEST function with list of columns, variables, or constant values to get the highest value out of all of them. Here is an example with list of constants:

This will return:

A screenshot of a computer

Description automatically generated

In the same manner,

I can also pass a list of columns to it:

This returns the following:

And while it is not as easy to read, note that you can pass in expressions as well, as you can get the same result by pushing the GREATEST function up into the primary query too:

Isn’t that a nifty feature?

This rest of this article, focuses on how to achieve the same functionality in pre-SQL Server 2022 versions of SQL Server. Same approaches can be used to mimic the functionality of LEAST function as well.

I recently had to remind a client of mine that we are past the halfway mark of year 2023 that they still didn’t have a single SQL Server 2022 server, even though their SQL Server licenses covered it. I would not be surprised if there are many organizations that may still have a sizable number of SQL Servers pre-SQL 2022.

I personally though, am looking forward to when the next release of SQL Server will come out, whether it would be in 2024, 2025, or 2026? Hopefully Microsoft doesn’t make us wait beyond the year 2026.

In this article I am going to demonstrate 2 viable alternative solutions to find the largest value of a group of values in your queries (not including writing a very complicated CASE expression!)

Table Value Constructor

To start with, I will discuss using a Table Value Constructor (TVC) as I think it is easier to write and understand. TVC is very similar to the VALUES clause that you have used with the INSERT INTO statements to add new data into a table. You are maybe already aware that you can insert multiple rows in a single INSERT INTO statement:

This returns:

id          name
----------- ----------

1           tom
2           jerry

3           brian

Similarly, you can use a VALUES clause to construct a derived table. Here is an example of using it with constants:

This will return the greatest of values in that derived table:

A screenshot of a computer

Description automatically generated

Similarly, you can use list of columns, variables, expressions etc. 

So, using the VALUES construct here, I will construct a derived table to convert values from multiple columns into values/rows for a single derived column, last_access_time, sort its results in descending order (from highest to lowest) then select the first row’s value to get the highest (i.e. GREATEST) value.

This will return something like the following (depending on your actual server’s utilization):

Note: depending on the kind of server you are using this on, it may have NULL values for some of the values. For example, on the editor’s computer, it returned:

So, just like MAX, it does not return NULL unless all scalar expressions are NULL. This method is by no means as easy to work with as the GREATEST function, but it is generally straightforward to implement if needed.

It looks complicated without the built-in GREATEST function, doesn’t it? Well, I think the next method, UNPIVOT, I will show you is even more complicated. That’s just my opinion, the opinions of others in general may vary.

Unpivot

The idea with using an unpivot is to take values from multiple columns and turns them into rows for a single column. In other words, it can be used to convert a set of columns into a row. If you are familiar with the TRANSPOSE function in excel, unpivot is very similar. It’s the opposite of the PIVOT statement, which takes values for a single column from multiple rows and turns each value into multiple columns.

Here is an example of how you can use UNPIVOT to convert columns into rows (note that if you have not executed any queries in the other databases since a reboot, no data may be returned by this query):

The values for the four columns last_user_lookup, last_user_scan, last_user_seek and last_user_update in the sys.dm_db_index_usage_stats are now showing as rows.

And I just want to know the highest value from the 4 columns, I can just use the MAX function and slightly modify the query to add the GROUP BY clause.

This returns the following on my server.

To extend the example further, I want to know if/when the last time a table in the the database was accessed. This query is longer than it may need to be because I also want to display the individual values for the 4 columns as well as the MAX/highest value among them.

This returns:

As you can see, the using UNPIVOT can take more lines of code and can be quite a bit complicated. As you will see later in this article, per the performance test I performed, it is also slower than the Table Value Constructor method.

I have tested these syntaxes in SQL Server versions going back to 2012. PIVOT and UNPIVOT were first introduced in SQL 2005 so the syntax’s I presented in this article should still work there.

Performance Test

How is the query performance between the TVF and Unpivot methods, or with the new greatest function?

To find that out, I am going to generate some random data. The following SQL script creates a table with name random_data_for_testing and inserts 10 million rows into it.

My first test query is to do a test to get the highest value among the 10 columns (named value_01 to value_10 in the test table), using all 3 methods. I ran the script multiple times to make sure there are no physical reads or read-ahead reads with any of the queries so we can compare just the duration, logical IO and CPU as accurately as possible.

Results:

Here are the key stats from the STATISTICS IO and STATISTICS TIME:

********* Test using the function Greatest...
Table 'random_data_for_testing'. Scan count 5, logical reads 118295
SQL Server Execution Times:
CPU time = 4405 ms, elapsed time = 1381 ms.

********* Test using the TVC method..........
Table 'random_data_for_testing'. Scan count 5, logical reads 117343
SQL Server Execution Times:
CPU time = 12000 ms, elapsed time = 3332 ms.

********* Test using the Unpivot...
Table 'random_data_for_testing'Scan count 5, logical reads 117179
SQL Server Execution Times:
CPU time = 17219 ms, elapsed time = 4813 ms.

The amount of IO (logical reads) is almost identical among the three methods. But notice the difference in the CPU time and Elapsed time values for each. The elapsed time for the function GREATEST is 1381 ms, for the TVC it is 3332 ms and for UNPIVOT it is 4813 ms.

Below is the screenshot of the actual execution plan and as you can see the query cost as relative to the batch, cost of function GREATEST is 12%, TVC is 34% and the UNPIVOT is 54%.

As you can see from the thickness of the pipe going from right to left, when a query needs to process a relatively great deal of rows, using the build in function gives the best performance. That shouldn’t be a surprise. If that wasn’t the case, somebody at Microsoft made a big booboo! However, between the TVF and Unpivot, TVF is faster.

Note: This by no means a scientific test. Generally, though I have seen TVF to be faster than Unpivot.

What if the query is using a very selective filter condition on a clustered primary key on a single, integer column? This kind of queries are usually ideal queries in “almost” any conditions, even if it includes columns with LOB data types like XML, VARBINARY(MAX) etc.
Just to be sure, let’s look at the following example:

This returns three identical result sets:

And quite similar execution times as well:

********* Test using the function Greatest...
Table 'random_data_for_testing'. Scan count 0, logical reads 3
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.

********* Test using the TVC method..........
Table 'random_data_for_testing'. Scan count 0, logical reads 3
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.

********* Test using the Unpivot...
Table 'random_data_for_testing'. Scan count 0, logical reads 6
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 5 ms.

With the following actual execution plans:

The entire script finishes in sub-sub-second. Still, at a microsecond (a millionth of a second) level, you can see that the Unpivot performs the worst.

Summary

In this article, I have demonstrated the new GREATEST and LEAST value functions in SQL Server 2022 and have show that in general testing, these functions are very fast. However, as not everyone is using SQL Server 2022, I also demonstrated a method using Table Value Constructors that give you similar functionality (with much less straightforward code), as well as a method using the often disregarded UNPIVOT function.

 

About the author

Drupal Grupal

See Profile

Drupal is a SQL Server consultant with over 20 years of experience in engineering innovative SQL Server solutions for high growth organizations. A truly SQL Server specialist and IT generalist, Drupal is a certified MCDBA, Oracle OCP-DBA and IBM WebSphere Administrator. He is also ITIL V3 certified at the Foundation and Intermediate levels as well as PMP.

Drupal's contributions
Drupal's latest contributions: