SQL Prompt Hidden Gem: Auto-fill the GROUP BY clause
Writing GROUP
BY
clauses is a very common, but tedious, activity for a SQL programmer. SQL Prompt will fill it in for you, with all non-aggregated columns returned by the query.
“Have you seen this new feature that auto-fills the GROUP BY with non-aggregated columns!” exclaimed my co-worker, soon after she had installed a new version of SSMS. I hadn’t, but I was intrigued. I must have written thousands of aggregate queries in my career, so it sounded like it could save me some time.
I searched around in SSMS, on the Virtual Machine I use for testing new versions of SQL Server, and it wasn’t there. Ah. It turned out that my testing VM didn’t have SQL Prompt installed, and this wasn’t a new SSMS feature, after all, but rather one of those ‘hidden’ SQL Prompt enhancements.
Filling in the GROUP BY manually
Let’s say I need to write a query on the WideWorldImporters database, to get the sum of items that have been ordered, grouped by postal code and the year when the customer’s account was opened.
In typical fashion, I start by filling in the tables and their join conditions. I type “or” to find the tables relating to orders, without even having to know that they live in the Sales
schema.
I expand the SELECT
list to return, initially, just the base rows that I need to work with. I apply my preferred Prompt formatting style, and the query looks something like this:
1 2 3 4 5 6 7 8 |
SELECT Customers.DeliveryPostalCode AS PostalCode, YEAR(Customers.AccountOpenedDate) AS AccountOpenYear, OrderLines.Quantity FROM Sales.OrderLines JOIN Sales.Orders ON Orders.OrderID = OrderLines.OrderID JOIN Sales.Customers ON Customers.CustomerID = Orders.CustomerID; |
The query returns a few hundred thousand rows, a small sample of which look like this:
1 2 3 4 5 6 |
PostalCode AccountOpenYear Quantity ---------- --------------- ----------- 90398 2013 7 90005 2013 7 90313 2013 7 90434 2013 7 |
I want to see total order quantity, by customer delivery postcode, and the year the customer opened their account, so the final SELECT
list looks like this:
1 2 3 |
SELECT Customers.DeliveryPostalCode AS PostalCode, YEAR(Customers.AccountOpenedDate) AS AccountOpenYear, SUM(OrderLines.Quantity) AS TotalOrderQuantity |
That just laves the GROUP BY
clause and typically, I would just copy and paste the non-aggregated columns in the SELECT
clause, and replace SELECT with GROUP BY:
1 2 |
GROUP BY Customers.DeliveryPostalCode AS PostalCode, YEAR(Customers.AccountOpenedDate) AS AccountOpenYear |
Finally, I just get rid of the aliases to give a final query like this:
1 2 3 4 5 6 7 8 9 10 |
SELECT Customers.DeliveryPostalCode AS PostalCode, YEAR(Customers.AccountOpenedDate) AS AccountOpenYear, OrderLines.Quantity FROM Sales.OrderLines JOIN Sales.Orders ON Orders.OrderID = OrderLines.OrderID JOIN Sales.Customers ON Customers.CustomerID = Orders.CustomerID GROUP BY Customers.DeliveryPostalCode, YEAR(Customers.AccountOpenedDate; |
All in all, it is a tedious process.
Auto-filling the GROUP BY with SQL Prompt
As with all clauses of a SQL statement, SQL Prompt offers assistance in filling out the GROUP BY
; I just hadn’t noticed it before! Just hit the space bar after the clause you type GROUP BY, shown in Figure 2.
Figure 2
Choose All non-aggregated columns, and it will fill in the GROUP BY clause with the following column expressions:
1 2 |
GROUP BY YEAR(Customers.AccountOpenedDate), Customers.DeliveryPostalCode; |
Alternatively, as you can also see in Figure 2, you can pick the non-aggregated columns individually. If you want to add additional groupings, such as by the month the customer opened the account, simply add MONTH(Customers.AccountOpenedDate) AS AccountOpenMonth
to the SELECT
clause, delete the existing GROUP BY
clause, and auto-refill the column expressions. The GROUP BY
clause will now look like this:
1 2 3 |
GROUP BY YEAR(Customers.AccountOpenedDate), MONTH(Customers.AccountOpenedDate), Customers.DeliveryPostalCode; |
In some cases, you may have values in your SELECT
clause that are in an aggregate function, but do not need to be in the GROUP BY
clause. For example, say we change the SELECT
query to include two new columns, highlighted in bold text in the following code. One is a literal value, and the other performs a calculation on a column that is already in the existing GROUP BY
:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
SELECT Customers.DeliveryPostalCode AS PostalCode, YEAR(Customers.AccountOpenedDate) AS AccountOpenYear, MONTH(Customers.AccountOpenedDate) AS AccountOpenMonth, <strong>1 AS Value,</strong> <strong>YEAR(Customers.AccountOpenedDate) - 1 AS PreviousAccountOpenYear,</strong> SUM(OrderLines.Quantity) AS TotalOrderQuantity FROM Sales.OrderLines JOIN Sales.Orders ON Orders.OrderID = OrderLines.OrderID JOIN Sales.Customers ON Customers.CustomerID = Orders.CustomerID GROUP BY YEAR(Customers.AccountOpenedDate), MONTH(Customers.AccountOpenedDate), Customers.DeliveryPostalCode; |
This query will execute as is, because we don’t need to group by on a literal, nor do we need to group on an expression that only uses columns or expressions already in the GROUP BY
.
However, if we delete the existing GROUP BY
clause, and have SQL Prompt auto-fill it, then it will look slightly different:
1 2 3 4 |
GROUP BY YEAR(Customers.AccountOpenedDate), MONTH(Customers.AccountOpenedDate), <strong> YEAR(Customers.AccountOpenedDate) - 1,</strong> Customers.DeliveryPostalCode |
This is a rare case, and will almost certainly not cause any difference to the execution of the query, but be aware of it.
That’s our query finished, but if you want to order the output, try typing ORDER BY
and see what comes up!
Summary
Once you install SQL Prompt, you become aware immediately of how it enhances the native IntelliSense features of SSMS, but this can blind you to a lot of other features, hidden in various parts of the tool. Often, I only stumble across them by accident, or when alerted by a colleague. This is just another gem of a feature that I found myself using frequently once I realize it existed!
Writing GROUP BY
clauses is a very common, but tedious, activity for a SQL programmer. Replace that tedium with software, I always say.