Product articles SQL Prompt SQL Code Analysis
SQL Prompt Code Analysis: INSERT INTO…

13 August 2018

1 Comment

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

13 August 2018

1 Comment

SQL Prompt Code Analysis: INSERT INTO a permanent table with ORDER BY (PE020)

Data in relational tables is never guaranteed to be returned in a particular order, so adding an ORDER BY to an INSERT INTO statement is meaningless and, pre-SQL Server 2012, could cause performance issues. If you really need to impose a particular order on rows in the table, use a Row_Number() window clause instead.

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

The SQL query that is used to produce the result that is inserted into the permanent table has its order specified by an ORDER BY statement. Relational tables are not ordered, so the ORDER BY is meaningless. Use a Row_Number() window clause instead, if you need to impose a particular order on rows in the table, via a query, especially if you need to allow subsequent insertions into the table. For temporary tables or table variables that will have no subsequent insertions, the current order can be maintained via an IDENTITY field in the target table, but it is an anachronism now that there are better and more versatile ways of doing it.

SQL Prompt will raise a violation of a performance rule (PE020) when it detects use of INSERT INTO…ORDER BY, when inserting into a permanent table.

SQL Prompt detects code smell PE020 - a query that inserts into a permanent table using an ORDER BY.

It does so to catch out a mistake that is easy to make, i.e. leaving in an ORDER BY clause that is no longer needed. Before SQL Server 2012, these ORDER BY clauses could be quite costly in terms of elapsed time and resources, but now they are generally ignored by the optimizer.

Meaningless ORDER BYs

Imagine that we have the query in Listing 1.

Listing 1

The results are as follows:

OK. We like this, and the order it gives, so we attempt to insert the data into our work table, preserving the same order.

Listing 2

Now we select from the table using:

Listing 3

Of course, the contents of a table have no natural order, and so the use of ORDER BY in Listing 2 is meaningless. If we had created a clustered index, on the FullName column, to turn OurPeople1 into a table rather than a heap, then Listing 3, then we’d get a different order, the order of the clustered index.

In this example, the ORDER BY clause in Listing 2 has been entirely ignored by SQL Server. The only way to guarantee to return data in a specific order, from our dbo.OurPeople1 table is to specify that order. If you want it ordered by Person.LastName, followed by Person.FirstName then first include the sort terms (LastName and FirstName) as columns in the work table and then do…

Listing 4

You could add the PRIMARY KEY field of the original data, i.e. add the BusinessEntity_ID of the Person.Person table, do a join and then order by those missing fields, as demonstrated in Listing 5.

Listing 5

Both these solutions preserve the order, whatever subsequent changes you make to the data.

The IDENTITY Trick

However, there is sometimes a reason why you can’t reference the original data by which you did the ordering, and you find that you need to specify an order. It is possible to specify the order with a surrogate field of incrementing numbers, but you’ll experience the downside to this solution if you add or amend the data so as to affect the sort order.

There was a time, before the introduction of the Row_Number()window function, that the only way that you could provide the order of the data in a work table was via the ‘IDENTITY trick’. You insert into a temporary table, to increment the IDENTITY field in the order provided by the ORDER BY clause.

Listing 6

The results are:

That’s a lot better, but why put the IDENTITY field in a table variable? Why not just create it in the destination table? The problem is in doing insertions. The IDENTITY field is immutable, so without this intermediate stage, you wouldn’t be able to change the order, or to easily do any subsequent insertions other than to the beginning or end of your order.

Using the Row_Number() window function

Nowadays, we don’t need any of that extra work: we don’t use an ORDER BY clause at all, and we can subsequently change the order if we need to. We just use the Row_Number()window function.

Listing 7

Summary

Of all the code I’ve shown here, only Listing 2 triggers the PE020 warning. It only happens if you insert into a permanent table using a SELECT statement with an ORDER BY clause. It is simply advising you of an unnecessary ORDER BY clause. If you are deliberately seeking to preserve a certain order, on insertion, then use the Row_Number() window function in that SELECT query, within the INSERT INTO statement, instead of using the ORDER BY statement.

Share this post.

Share on FacebookShare on Google+Share on LinkedInTweet about this on Twitter
  • Chad Baldwin

    Ran into this today….But it’s on a permanent table with an IDENTITY column that is the primary key (clustered).

    It’s a table that tracks usage, I am randomly generating some sample data, so I want the records to be inserted in a certain order. So I simply used an ORDER BY and it works great.

    So is what I’m doing correct? Or is this frowned upon? Cause it doesn’t seem to go along with the examples above, but I’m still getting the PE020 error….

You may also like

  • Article

    How to Test SQL Server Functions and Procedures using SQL Prompt

    Phil Factor shows how to use the "Script As Insert" feature in SQL Prompt to refactor a table of "input and expected output" values into a simple test harness for stored procedures and functions.

  • Community event

    Raleigh Code Camp

    Redgate will be remotely sponsoring the Raleigh Code Camp 2019. Code Camp is a place for local software developers to come and learn from their peers. This community-driven event has become an international trend where peer groups of all platforms, programming languages and disciplines band together to bring content to the community.

  • Article

    SQL Prompt Code Analysis: A Hint is Used (PE004-7)

    Phil Factor suggests a philosophy of "the SQL query optimizer knows best" when it comes to choosing the right execution plan. Use hints as a last resort, and evaluate them carefully whenever SQL Prompt warns you of their presence in your SQL code.

  • Article

    Q&A from our recent ‘What’s New in SQL Toolbelt’ webinar

    I recently gave a Redgate University webinar with Chris Kerswell and Thomas Walsh. Giving webinars as a team is terrific, as it shows how we work at Redgate. Chris is a Sales Engineer, Tom is a Software Engineer, and I’m a DevOps Advocate. Behind the scenes, we often work together across these groups to solve