Find 40 Problems in this SQL Server Stored Procedure

Comments 0

Share to social media

I’ve been at this for a while now, and have a very particular set of rules and coding conventions that I follow when writing and, more importantly, reviewing T-SQL code. I often perform code reviews and thought it would be fun to frame this exercise: a completely fictitious stored procedure hits my desk, I’ll reject it of course, and enumerate the reasons why. The procedure might pass review in a lot of shops, but not in mine.

We’re going to use the AdventureWorks sample database (get your copy here), where the folks in marketing requested a list of users to e-mail a new promotional campaign. The customers need to meet at least one of the following criteria:

  • last placed an order more than a year ago
  • placed 3 or more orders in the past year
  • have ordered from a specific category in the past two weeks

These criteria don’t have to make sense! They just need to make the query a little bit more complex than your average CRUD operations.

First, we’re going to update Sales.SalesOrderHeader to modern times so that dates make sense relative to today. We only care about OrderDate here, but there are check constraints that protect a couple of other columns (as well as a trigger that sometimes fails on db<>fiddle but that I have no energy to troubleshoot):

This stored procedure that someone wrote will now return data (without the update, it would be hard to write predictable queries based on, say, some offset from GETDATE()).

You laugh. Or maybe you cry. But I do see all of this, all the time, just maybe not always all at the same time.

While the stored procedure itself is only 40 lines, I’m sure a quick skim will reveal some things you don’t like. But do you think I can find 40 problems?

Challenge Accepted.

Before we start, I can’t emphasize enough that just about any convention in T-SQL is completely subjective, and you may or may not agree with any of mine. And that’s okay. Still, I’ll tell you the problems I spotted, line by line, and why I think they’re problems (with helpful links to more details, where appropriate).

Line 1

On the very first line, there are multiple problems:

  1. All lower case keywords – I’m only going to mention this once, but reading a wall of code in all lower case (or all upper case!) is hard on the eyes. Personally, I like upper casing T-SQL keywords, Pascal case for entity names, and lower case for data types (to be mentioned in a moment).
  2. Shorthand – The abbreviation proc is grating to me because people start to talk that way, too. “Can you create a prock?” It’s a procedure; just spell it out.
  3. Missing schema prefix – Leaving out the schema can be problematic both for performance and for other users and maintainers. While the performance issue is less relevant during object creation, it’s a mindset and we should all be consistent.
  4. Bad procedure name – Not just the sp_ prefix, which should be avoided but, also, what does “find_customers” mean? This name should be more specific to the task at hand, and also match your existing naming scheme. Something like Customer_GetListForCampaignXYZ, for example. (Generally, I prefer entity_action, like Customer_Find, over action_entity, like Find_Customer, primarily due to sorting and grouping. I’m very rarely looking for all the stored procedures that find anything.)
  5. Parentheses for nothing – It’s a minor gripe, but I don’t like adding extra control characters for the sake of it. Stored procedure parameter lists do not need to be enclosed in parentheses, so add them only if your team agrees that they actually make the procedure more readable.
  6. Bad parameter name (i) – It doesn’t need to start with @param – that is obvious from where it is and, later in the procedure, the source isn’t likely to be important anyway.
  7. Bad parameter name (ii) – It also doesn’t need to be prefixed with its data type – that is also obvious from the definition, and naming it this way locks it in (making future changes to that data type harder).
  8. Bad parameter name (iii) – In AdventureWorks, the IDs are upper case, so it should be @CategoryID just to be consistent and avoid any confusion or ambiguity.
  9. Upper case data type – I learned the hard way the best pattern for data types is to match the case in sys.types exactly, which is arguably more important if you ever deal with case sensitive databases or instances. While it doesn’t affect the basic native types, for consistency, this means you would use int, not INT.
  10. Everything on one line – For formatting purposes I kept the parameter list short, but I’ve seen lists of 10+ parameters scroll way off the screen to the right, and I don’t have to tell you what a headache that is during a fire drill. Strive to be more liberal with white space – the goal shouldn’t be to keep the line count short at the cost of readability.
Line 2

It’s only two characters, but it’s more about what’s missing here:

  1. No BEGIN/END – I prefer to wrap my procedure body in clear block elements that are even more explicit about where the procedure ends than indentation and the (non-T-SQL!) GO batch separator could ever be.
  2. No SET NOCOUNT – I always add SET NOCOUNT ON; to the beginning of every stored procedure to suppress all of those chatty DONE_IN_PROC messages. While in modern versions of SQL Server and various transport layers the benefits aren’t as pronounced as they used to be, the extra chatter still can’t possibly help anything.
Line 3

Only one new problem here:

  1. Lower case columns – In the metadata, those columns are named CustomerID, FirstName, and EmailAddress. Every single reference to them should match.
Line 5

  1. Single-line comments – While it’s actually refreshing to see comments inside of queries, not all comments are the same. As Brent Ozar explains in Never, Ever, Ever Start T-SQL Comments with Two Dashes, single-line comments cause many problems downstream – for example, when reviewing queries from the plan cache or monitoring tools, or even returning the definition to a grid in Management Studio. I’m far from perfect, especially in quick throwaway fiddles, but I’ve been trying harder to always use /* this form */. Contradicting Brent a little bit, Andy Mallon has a great productivity use case of combining the two forms in slide 28 of this presentation.
Line 7

  1. Unnecessary DISTINCT – This query is part of a UNION which – by default, and possibly not commonly known – has to perform a DISTINCT against the output of the concatenation anyway, so doing so inside any of the inputs to the UNION is redundant. This query is littered with these and they can all be avoided – typically they are used to suppress duplicate rows from joins that should have been implemented as EXISTS checks instead and/or deferring the joins until later.
Line 8

All right, now we’re getting warmed up! A few new problems here:

  1. Lower case object names – These tables are called Sales.Customer and Person.Person, and it’s important for code to match what is in the metadata since it may be destined for case sensitive environments. IntelliSense and other tools make missing this detail hard to excuse these days. If you have a case sensitive database, being cavalier about this can lead to problems, especially if production is case sensitive and your development environment is not. I’m a big advocate of always developing in case sensitive because surprises are easier to fix locally and up front than after they’ve been deployed.
  2. Bad aliases – Why is customer aliased as a and person as b? In longer and more complex queries, reverse engineering this bizarre (but common!) mapping pattern is a recipe for migraines. As an aside, I detest aliases that are added in this form without using the AS keyword – while optional, I think it gives an important visual cue.
  3. Littering of NOLOCK – This hint is not a magic go-faster switch; it is just bad news in general. Placing it next to every table reference throughout all queries is even worse. If you have some reason you can’t use READ COMMITTED SNAPSHOT (RCSI), at the very least set the isolation level once at the top of the procedure (and maybe even add a comment explaining why this is necessary). Then, when you come to your senses, it’s just one line to fix. (I collected a treasure trove of resources detailing the many perils of NOLOCK here.)
  4. Old-style joins – I still see these all the time, even though better syntax replaced them more than 30 years ago! FROM a, b is less desirable than explicit joins for a couple of reasons. One is that it forces you to group join criteria and filter predicates together (and an argument there that is only partially true is that it can invite unintended Cartesian products). The other is that it forces inconsistency because in SQL Server there is no longer a way to perform deterministic outer joins without switching to explicit OUTER JOIN syntax.
Line 9

  1. Missing alias – Why is there no alias reference associated with personid? When writing the initial query, this might not be a problem, because you’d get an ambiguous column name error if it were. But what about when someone comes back later and adds another table to the query? Now they have to go fix all unreferenced columns and possibly go look up all the tables to unravel. I talked about how we should always reference all columns for a recent T-SQL Tuesday.
Line 10

There’s definitely a code smell starting to waft from this line:

  1. Correlated subquery in a predicate – Running an aggregate against another table as part of a WHERE clause has the potential to scan that other table completely for every row in the current query (which could be a cross product for all we know).
  2. Subquery indentation – Any time you introduce an IN or EXISTS or other type of correlated anything, you should use white space to make it clear something radical is happening here – start it on the next line, make the left margin wider, do something to make it obvious and easier to follow.
Line 12

  1. DATEPART shorthand – I don’t even know if you could call this “shorthand,” as I certainly see no advantage to typing four letters (yyyy) instead of four different letters that form a self-documenting word (year). A shorthand habit here can lead to problems, e.g. play with date parts like W and Y. I talk about abusing shorthand and several other date-related topics in Dating Responsibly.
  2. Repeated non-deterministic variable – A minor thing but if getdate() is evaluated multiple times in the query, it could give a different answer, especially if the query is run right around midnight. Much better to pull these calculations out into a variable – not necessarily for better estimates, but to make sure all the predicates are speaking the same language.
Line 21

No new problems until we hit line 21:

  1. DATEPART shorthand – I’m mentioning this again not because of the shorthand itself but because it’s inconsistent. If you’re going to use yyyy then always use yyyy. Don’t get even lazier about it sometimes.
  2. BETWEEN {some time} and … {now}? – I’m already not a big fan of BETWEEN for date range queries (it can lead to lots of ambiguity), but I certainly see no reason to include the current time as the upper bound, rather than the whole clause just being >= {some time}. Orders generally aren’t placed in the future and, if they are, all the queries in this procedure should use that same upper bound. And if there is some reason the query should filter out future dates, there should be a comment here explaining why.
Line 27

  1. Inconsistent join ordering – in the previous section, the tables were listed person then emailaddress. Now the order is swapped, and maybe there’s a reason, but it causes the reader to pause and wonder what it is.
  2. Inconsistent aliasing – In the previous section of the query, person was aliased as p, and emailaddress as e. Now they’ve changed to pp and em – while they are technically different queries, it is much better for future maintainers to adopt a consistent aliasing strategy (including what to do when a single query is aliased multiple times in the same query), so that there is less reverse engineering every time a query is reviewed.
  3. Inconsistent hinting – All the other NOLOCK hints in the query are just added to the table alias. This one is done more correctly, with the proper WITH keyword, but it’s a variation that can be disruptive to a reviewer.
Line 28

  1. Inconsistent clause ordering – Getting nit-picky here perhaps, but in the previous section, the clauses were listed in the same order the tables were introduced to the join; this time, they’re swapped. If there’s a reason the order has changed, there should be a comment to help a future maintainer understand.
  2. Inconsistent spacing – I like white space around my keywords and operators, so it really stands out to me when there is a change in pattern – in this case, there is no helpful white space around the = operator.
Line 30

  1. TOP (2 billion) – This seems a holdover from the SQL Server 2000 days, where you could fool the engine into returning data from a view in a specific order without having to say so, simply by putting TOP 100 PERCENT ... ORDER BY into the view definition. That hasn’t worked in 20 years, but I see this technique sometimes used to try to defeat row goals or coerce a specific index. Most likely it was just copied and pasted from somewhere else, but it has no place here.
Line 33

  1. Expression against a column – SQL Server won’t be able to use an index on OrderDate because it will have to evaluate the expression for every row. This is a pattern I see quite a bit, and it can be hard to explain to newer users because, rightly so, they expect SQL Server to be able to rewrite the expression so that it can use an index. Always question when a WHERE or ON clause has to change what’s in the column in order to perform a comparison. In this case, much better (and more logical) to say WHERE oh.OrderDate >= @TwoWeeksAgo.
Line 35

  1. Inconsistent join syntax – This one wasn’t really a problem for me until it became inconsistent – don’t say join sometimes, and inner join other times. Personally, I prefer being more explicit about the type of join, but pick one or the other and always use that form.
  2. Another table alias issue – Suddenly we get an AS here, seemingly out of nowhere. I think this is another case where inconsistency is borne from queries being copied & pasted from other sources or multiple authors contributing to a query. It may seem trite but take the time to step through the query and make sure all of the conventions match your established style (and if you don’t have one, create one).
Line 37

  1. Filter as a join clause – Technically, that and should be a WHERE, since that is not a part of the join criteria. When we’re talking about OUTER joins, there is more nuance depending on which table the predicate applies to, but for an INNER join, the join and filter criteria shouldn’t mix.
  2. Misplaced ORDER BY – Merely mentioning this because when the TOP is removed this pointless clause will actually cause a compilation error.
Line 39

  1. Meaningless alias name – This alias should be more descriptive about the subquery it references. And trust me, I’m guilty of this too; you’ll see x as an alias or CTE name in many of my solutions on Stack Overflow (but never in production code). I never said some of the problems I’d point out here aren’t my own problems too!
Line 40

  1. ORDER BY ordinal – Using ordinal numbers to define sorting is convenient, sure, but it’s very brittle and not self-documenting. Anyone reading the code now has to go parse the SELECT list to work it out, and anyone modifying the SELECT list might not realize that adding or rearranging columns just introduced a bug in the application.
  2. No statement terminator – This is one of the things I’ve been harping on throughout a decent portion of my career – we should always be terminating statements with semi-colons. Though, personally, I don’t like adding them to block wrappers, like BEGIN/END, because to me those aren’t independent statements (and try adding a semi-colon to BEGIN/END TRY and BEGIN/END CATCH – one of these things is definitely not like the others.

See, That Wasn’t So Hard.

I looked at a 40-line stored procedure and identified 40 problems (not counting the ones that occurred multiple times). That’s a problem per line! And it wasn’t even that hard to come up with this “crazy” stored procedure – it’s actually quite believable that something similar could have crossed your desk as part of a pull request.

But How Would You Write It, Aaron?

Well, that’s a good question.

I would certainly start by stepping back to understand what the procedure is actually trying to accomplish. We have three separate queries that all do similar things: they join order and customer tables and include or exclude rows based on dates and counts and other criteria. Doing this independently and then combining them with multiple distinct sorts to remove duplicates is wasteful and could lead to unacceptable performance at scale.

So, I would attack the problem with the goal of touching each table only once – and of course cleaning up all the other things I noticed (most of which, admittedly, are aesthetic and have no bearing on performance). Taking a quick swing at it, and keeping in mind there are a dozen ways to write this, I came up with the following:

Total cost (in terms of line count): 10 additional lines (20%). Character count is actually slightly lower: 1,845 -> 1,793. But the improvement to readability, maintainability, and grokability: priceless.

What About Performance?

After confirming that they return the same results, I compared execution times and a few plan characteristics. Here are the raw metrics observed at runtime (click to enlarge):

Runtime metrics comparison

On my system, which has modern NVMe drives, the original query is a lot faster than I thought (even with a cold cache). But notice the much higher number of reads, which could be an issue on systems with slower or more contentious I/O, or not enough memory to always have the data in the buffer pool. Let’s look at the execution plans a little closer.

For the old query, here is the plan (click to enlarge):

Original, busy plan

And here is the new plan, which is far simpler (click to enlarge):

New, simpler plan

You can get a general sense for the differences by looking at the plans casually, but we can also take a quantitative comparison of the operators and other information the plans provide (with the worse figure highlighted in this salmon-y color):

Metric Original New Notes
Estimated Cost % 40.9% 59.1% Given everything else below, this demonstrates why estimated cost % can be misleading (a.k.a. garbage).
Estimated Rows
(Actual: 2,313)
25,700
1,111%
6,920
299%
Neither estimate was a bullseye, but the UNION query was off by more than an order of magnitude.
Reads 9,626 1,902 This can be a good warning for systems that are I/O-bound and/or memory-bound.
Clustered Index Scans
7 4 Scans against SalesOrderHeader: 3 vs. 1
Hash Matches 8 5 This can be a good warning for systems that are memory-bound (more info).
Sorts 5 1 And one sort in each plan (the final ORDER BY) isn’t necessary.
Spills 1 0 There was one level 2 spill in the original plan, nothing disastrous; likely a result of bad estimates.
Spools 0 1 The spool in the new plan is the only operator that is concerning – it only handles ~7K rows, but represents 13% of overall CPU.
Residual I/O Warnings 2 0 These are two bright yellow signs that the original query is reading a lot more data than it should.

Given all that, there’s very little redeeming about the original approach to union three separate queries – and I would still feel that way even without the unnecessary distinct sorts. The code review created a much more readable procedure and made the query perform better in nearly every measurable aspect. And it could potentially be tuned even further, even though a one-off lead generation procedure isn’t likely to be performance critical. The important thing is that making these conventions and processes commonplace will make it more likely that the next procedure won’t face such a rigorous review.

What If The Problems Aren’t Obvious?

As I stated at the beginning, I’ve developed a sense of code smells and best practices through experience over my entire career, and I’m still adjusting my techniques. So nobody expects you to be able to spot all of these problems out of the gate. One way to get a head start is to try out an add-in like SQL Prompt, which certainly did identify several of the above issues.

Wrapping Up

Now, would I torture a co-worker with a code review this thorough and pedantic? Of course not. But if the code raises my left eyebrow this many times, now I have a blog post I can point them at. 🙂