Cleaning up common T-SQL coding issues with SQL Prompt
Some SQL coding habits are just annoying. Commas in front of column names? No way! Others are actively harmful; they’ll make your code error prone, harder for others to read and understand, and even harder to edit without making mistakes. This article covers five harmful problems that I see regularly in T-SQL code, and shows how you can use SQL Prompt to avoid or remove them.
Object owners
When you construct your T-SQL statements, it’s good practice to supply the owner of a given object. Every table reference should supply the schema to which the table belongs. Every column reference should supply the parent table. We do this for two reasons.
The first reason is clarity: it makes it much easier to read and understand your code when you know the owner of a column.
The second reason is to avoid compilation errors or, even worse, queries that run but return the wrong data. If you have one of those databases where every object belongs to the ‘dbo’ schema (I advise against it, as does Microsoft), then you can skip including the schema qualifier in your queries. However, if the database uses schemas, you’ll get compile errors if your queries don’t use schema qualifiers to reference the tables. Even worse, if tables and views have the same name in two different schemas, a common practice, you may get incorrect data from the query.
The query in Listing 1 runs without errors, because it uses schema qualifiers, which isn’t to say it’s free from problems.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
USE WideWorldImporters GO SELECT ExpectedDeliveryDate, OrderDate, LastReceiptDate, pol.LastEditedWhen, ValidFrom, ValidTo FROM Purchasing.PurchaseOrders AS po JOIN Purchasing.PurchaseOrderLines AS pol ON pol.PurchaseOrderID = po.PurchaseOrderID JOIN Application.People ON PersonID = po.ContactPersonID WHERE LastReceiptDate > '1/1/2016' |
Listing 1
In fact, this code is a bit of a mess. Two of the tables have an alias, but not Application.People
. Why? Because the query doesn’t reference any columns from that table that also appear in other tables. Likewise, an alias is used to identify the owner of the LastEditedWhen
column in the SELECT
and the PurchaseOrderID
columns in the join, but not any of the other columns. Again, these are the only columns that are common between tables, and so are the only ones the developer had to qualify.
This ‘lazy’ approach to object qualifiers can lead to all sorts of confusion. Which table contains OrderDate
or LastReceiptDate
? We have no way of knowing from this code. It’s reasonable to guess that PersonID
is in the Application.People
table, but T-SQL syntax would support referring to any other table as part of the JOIN
criteria, so we could easily have an issue there that can’t be resolved.
All of this makes the code hard to read, and hard to edit without error. SQL Prompt can help both avoid these problems in the first place, when writing new code, and remove them from existing code.
While writing code, SQL Prompt can supply, automatically, the object owners. You can even control this behavior in the Options, as shown in Figure 1.
If you prefer to use the full table name to aliases, you can.
For existing code, such as the query in Listing 1, SQL Prompt can apply object owners as part of code reformatting. I added a p
alias for the Application.People
table manually (SQL Prompt can do this for you too – see the next section) and then chose Format SQL from the SQL Prompt menu, or hit Ctrl-K, Ctrl-Y, to apply all formatting options. You can include object qualification as one of the Format SQL options by selecting it under Format | Actions in the SQL Prompt Options window. Alternatively, you can simply select the Qualify Object Names refactoring option in the SQL Prompt menu. The result is the code in Listing 2, which is much more readable and less likely to cause headaches.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
USE WideWorldImporters; GO SELECT po.ExpectedDeliveryDate, po.OrderDate, pol.LastReceiptDate, pol.LastEditedWhen, p.ValidFrom, p.ValidTo FROM Purchasing.PurchaseOrders AS po JOIN Purchasing.PurchaseOrderLines AS pol ON pol.PurchaseOrderID = po.PurchaseOrderID JOIN Application.People AS p ON p.PersonID = po.ContactPersonID WHERE pol.LastReceiptDate > '1/1/2016'; |
Listing 2
Aliases
As described previously, when writing T-SQL it’s important that you designate the owner of each object, and an alias allows you a ‘shorthand’ way to do that, rather than use the full table name each time. Of course, if you prefer the latter approach, it’s fine.
However, it is possible to misuse aliases in a way that leads to confusion. Listing 3 shows an example.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
USE AdventureWorks2014 GO select a.Name,b.Name,c.Name from production.product as a join production.ProductSubcategory as b on a.productsubcategoryid = b.productsubcategoryid join production.productcategory as c on b.productcategoryid = c.productcategoryid where a.class = 'L' select a.Name,b.Name,c.Name from production.productsubcategory as a join production.Product as b on a.productsubcategoryid = b.productsubcategoryid join production.productcategory as c on a.productcategoryid = c.productcategoryid where a.productcategoryid = 4 select a.Name,b.Name,c.Name from production.productcategory as a join production.productsubcategory as b on b.productcategoryid = a.productcategoryid join production.product as c on b.productsubcategoryid = c.productsubcategoryid where a.name like 'C%' |
Listing 3
Here we have three different queries against the same tables. The developer has chosen to use meaningless aliases, simply using alphabetically ordered alias for each table, depending on the order it appears in the query. You may laugh, but I’ve seem similar practices in production code.
Each query is written slightly differently, changing the JOIN
order as well as the WHERE
clause. In the first query, Production.ProductSubcategory
is aliased as ‘b’. In the second, that table is ‘a’, and in the third, it’s back to ‘b’ again. Production.Product
is ‘a’, ‘b’, or ‘c’, depending on the query.
Confusion reigns, and these are just 3-table queries. It’s entirely possible, depending on your database even likely, to exceed 26 tables in a query and exceed 26 characters. Following this pattern, you get further confusion as you either add letters, ‘aa’, ‘ba’ or ‘bb’ or use numbers, ‘a1′,’b2’, and so on.
Do not go down this sorry path. Use sensible, meaningful aliases, based on the table name. In this example, the aliases ‘p’, ‘ps’, and ‘pc’ will be fine. SQL Prompt can even supply these aliases as you type the table names. You can control the behavior and even provide your own custom aliases as desired, as shown in Figure 2.
With this approach, you’ll see a much more consistent aliasing strategy, so that the code for one of these queries looks more like Listing 4.
1 2 3 4 5 6 7 8 9 10 11 |
USE AdventureWorks2014; GO SELECT p.Name, pc.Name, ps.Name FROM Production.Product AS p JOIN Production.ProductSubcategory AS ps ON ps.ProductSubcategoryID = p.ProductSubcategoryID JOIN Production.ProductCategory AS pc ON pc.ProductCategoryID = ps.ProductCategoryID WHERE p.Class = 'L'; |
Listing 4
Now, each time Product
is referenced, the alias will be ‘p’, ProductSubcategory
is ‘ps’, and ProductCategory
is ‘pc’. This will happen regardless of the order in which they’re written in the FROM clause. That makes the code more consistent and more readable, which avoids all the issues you see from the previous pattern.
Statement terminator
The ANSI standard has long included the semi-colon as a statement terminator. However, SQL Server does not require you use it. That doesn’t mean you don’t need it. A large number of statements within SQL Server require the preceding statement be terminated with a semi-colon. An incomplete listing of those commands is as follows:
WITH
Service Broker
commandsENABLE
/DISABLE
triggerMERGE
THROW
With each new version of SQL Server, more and more commands require the semi-colon as a terminator. In code where the use of statement terminators isn’t standard, we can run into problems. Consider the code in Listing 5, which uses a Common Table Expression (CTE).
1 2 3 4 5 6 7 8 9 10 11 |
;WITH x AS (SELECT bg.BuyingGroupName, c.CustomerID FROM Sales.BuyingGroups AS bg JOIN Sales.Customers AS c ON c.BuyingGroupID = bg.BuyingGroupID ) SELECT x.BuyingGroupName, COUNT(x.CustomerID) FROM x GROUP BY x.BuyingGroupName |
Listing 5
Notice the developer has tossed in a ‘random’ semi-colon before the WITH
. Why? Because if it’s not there and the previous statement required a terminator but it was missing, Listing 5 will generate an error.
The correct solution, of course, is not to add a semi-colon to the start whenever you use a CTE, but to future-proof your code by ensuring all statements are terminated. Once more, SQL Prompt will help out. You can configure it to automatically terminate all your statements with a semi-colon.
In fact, if you look back through the example code in this article, all the “before” code is missing the semi-colon and all the “after” code has it, thanks to SQL Prompt.
Single line queries
For some reason, some developers shy away from using the carriage return when writing T-SQL. As a result, you’ll see code that looks like that shown in Listing 6.
1 2 3 4 5 |
SELECT po.ExpectedDeliveryDate, po.OrderDate, pol.LastReceiptDate, pol.LastEditedWhen, p.ValidFrom, p.ValidTo FROM Purchasing.PurchaseOrders AS po JOIN Purchasing.PurchaseOrderLines AS pol ON pol.PurchaseOrderID = po.PurchaseOrderID JOIN Application.People AS p ON p.PersonID = po.ContactPersonID WHERE LastReceiptDate > '1/1/2016' |
Listing 6
That’s a single line in SSMS, which makes it extremely difficult to read. Further, it’s hard to spot issues with the code. For example, did you notice that LastReceiptDate
in the WHERE
clause doesn’t have an object owner?
This is a classic solution that SQL Prompt has supported for a very long time. In fact, the latest version, currently in beta, provides customizable formatting templates that allow far more granular control over exactly how you want your code to be laid out. From the SQL Prompt menu, select Edit formatting styles, then hit the Edit style button.
This allows direct control over how the lines are broken up, tabs, spaces, wraps, and more. It’s all focused on ensuring maximum code readability to make it possible for you to spot issues.
Brackets
Your databases may have spaces in the table names or use non-standard characters. This can be through choice, or because you have to use a third party tool. When this happens, any identifiers that contain the spaces, non-standard characters or keywords will need to have square brackets around them. One approach is to put the brackets around all identifiers, regardless of whether they need them or not.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
USE WideWorldImporters; GO SELECT [po].[ExpectedDeliveryDate], [po].[OrderDate], [pol].[LastReceiptDate], [pol].[LastEditedWhen], [p].[ValidFrom], [p].[ValidTo] FROM [Purchasing].[PurchaseOrders] AS po JOIN [Purchasing].[PurchaseOrderLines] AS pol ON [pol].[PurchaseOrderID] = [po].[PurchaseOrderID] JOIN [Application].[People] AS p ON [p].[PersonID] = [po].[ContactPersonID] WHERE [pol].[LastReceiptDate] > '1/1/2016'; |
Listing 7
Personally, I find that painful to look at, however, you may not. Whether you want to use brackets, or need to use brackets, SQL Prompt can help you out.
With the settings shown in Figure 5, SQL Prompt will add brackets to any identifiers that need them, due to use of spaces, reserved words, and so on. If you check the Enclose identifiers with square brackets [] box, it will add them to all identifiers.
If you simply want to rid existing code of brackets, you can choose the Remove Square Brackets refactoring option from the SQL Prompt menu, and it will leave in only those brackets that are required.
Conclusion
If your T-SQL code is riddled with formatting and layout problems that make the code difficult to read and hard to edit, without accidentally introducing errors, then SQL Prompt will clean it up for you very quickly. The ability to customize the behavior to fit your own local style will also assist you in ensuring your code is that much more readable and consistent.