{"id":1064,"date":"2011-01-11T00:00:00","date_gmt":"2011-01-11T00:00:00","guid":{"rendered":"https:\/\/test.simple-talk.com\/uncategorized\/showplan-operator-of-the-week-merge-interval\/"},"modified":"2021-08-16T15:02:09","modified_gmt":"2021-08-16T15:02:09","slug":"showplan-operator-of-the-week-merge-interval","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/learn\/showplan-operator-of-the-week-merge-interval\/","title":{"rendered":"Showplan Operator of the Week &#8211; Merge Interval"},"content":{"rendered":"<div id=\"pretty\">\n<p>&nbsp;<\/p>\n<div class=\"float-left\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1214-merge_1.gif\" alt=\"1214-merge_1.gif\" \/><br \/>\n Merge <br \/>\n Interval<\/div>\n<p>Hello dear readers. Once again here I am to talk with you about ShowPlan operators. Over the past few weeks and months, we&#8217;ve featured the ShowPlan operators that are used by SQL Server to build the query plan. If you&#8217;re just getting started with my Showplan series, you can find a list of all my articles <a href=\"http:\/\/www.simple-talk.com\/author\/fabiano-amorim\/\">here<\/a>.<\/p>\n<h1>Introduction<\/h1>\n<p class=\"MsoNormal\">In my last article I wrote about the Merge Join operator, and now I would like to continue with the subject &#8220;Merge&#8221; but now is time to feature another kind of merge, the Merge Interval operator.<\/p>\n<p class=\"MsoNormal\">Last week I was working, with a customer in Finland, to optimize some queries when I saw this operator in the execution plan. Because this is not very well documented, I&#8217;ll try to cover all <b><i>aspects<\/i><\/b> and <b><i>bugs<\/i><\/b> related to this operator (a.k.a. iterators).<\/p>\n<p class=\"MsoNormal\">In short, this is used to remove duplicated predicates in a query, and to find possible overlapping intervals in order to optimize these filters so as to avoid\u00a0 scanning the same data more than once.<\/p>\n<p class=\"MsoNormal\">As always, I completely understand that this is not as simple as I&#8217;ve just stated. You may have to read what I wrote more than three times to understand what I mean: Don&#8217;t worry about that, because I&#8217;ll go deep into this subject step by step so as to make it easier for you to understand.<\/p>\n<h1>Creating sample data<\/h1>\n<p class=\"MsoNormal\">To illustrate the Merge Interval behaviour, I&#8217;ll start by creating one table called &#8220;Pedidos&#8221; (which means &#8216;Orders&#8217; in Portuguese). The following script will create the tables and populate them with some garbage data:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">USE tempdb\r\nGO\r\n\u00a0\r\nIF OBJECT_ID('Pedidos') IS NOT NULL\r\n\u00a0 DROP TABLE Pedidos\r\nGO\r\n\u00a0\r\nCREATE TABLE Pedidos (ID INT IDENTITY(1,1) PRIMARY KEY,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 ID_Cliente INT NOT NULL,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Quantidade SmallInt NOT NULL,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Valor Numeric(18,2) NOT NULL,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Data DATETIME NOT NULL)\r\nGO\r\n\u00a0\r\nDECLARE @I SmallInt\r\nSET @I = 0\r\n\u00a0\r\nWHILE @I &lt; 10000\r\nBEGIN\r\n\u00a0 INSERT INTO Pedidos(ID_Cliente, Quantidade, Valor, Data)\r\n\u00a0\u00a0\u00a0 SELECT ABS(CheckSUM(NEWID()) \/ 100000000),\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 ABS(CheckSUM(NEWID()) \/ 10000000),\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 ABS(CONVERT(Numeric(18,2), (CheckSUM(NEWID()) \/ 1000000.5))),\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 GETDATE() - (CheckSUM(NEWID()) \/ 1000000)\r\n\u00a0 SET @I = @I + 1\r\nEND\r\nGO\r\n<\/pre>\n<p class=\"MsoNormal\">Now that we have the table, we have to create two non-clustered indexes. The first uses the column <b>ID_Cliente <\/b>as a Key, including the column <b>Valor<\/b> to create a covered index to our query. And another using the column <b>Data <\/b>as a Key and including the column <b>Valor<\/b>.<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">CREATE NONCLUSTERED INDEX ix_ID_Cliente ON Pedidos(ID_Cliente) INCLUDE (Valor)\r\nGO\r\nCREATE NONCLUSTERED INDEX ix_Data ON Pedidos(Data) INCLUDE (Valor)\r\nGO\r\n<\/pre>\n<h1>Merge Interval<\/h1>\n<p class=\"MsoNormal\">Now that we have the data, we can write a query to see the merge interval. The following query is selecting the amount of sales for four customers:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">SELECT SUM(Valor) AS Val\r\n\u00a0 FROM Pedidos\r\n\u00a0WHERE ID_Cliente IN (1,2,3,4)\r\nGO\r\n<\/pre>\n<p class=\"MsoNormal\">For the query above we&#8217;ve the following execution plan:<\/p>\n<p class=\"illustration\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1214-image002-630x303.jpg\" alt=\"1214-image002-630x303.jpg\" width=\"630\" height=\"303\" \/><\/p>\n<p class=\"caption\">Figure 1 &#8211; Execution Plan (click on this to see it full-size)<\/p>\n<p class=\"MsoNormal\">In the execution plan above we can see that QO chose to use the index <b>ix_ID_Cliente <\/b>to seek the data for each <b>ID_Cliente<\/b> specified in the IN clause, and then uses the <a href=\"http:\/\/www.simple-talk.com\/sql\/learn-sql-server\/showplan-operator-of-the-week---stream-aggregate\/\">Stream Aggregate<\/a> to perform the sum<b>.<\/b><\/p>\n<p class=\"MsoNormal\">This is a classic Index Seek task, for each value SQL Server will read the data throw the balanced index tree searching for the <b>ID_Cliente<\/b>. For now, It doesn&#8217;t require the Merge Interval.<\/p>\n<p class=\"MsoNormal\">Now let&#8217;s looks at a similar query:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">DECLARE @v1 Int = 1, \r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 @v2 Int = 2,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 @v3 Int = 3,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 @v4 Int = 4\r\n\u00a0\r\nSELECT SUM(Valor) AS Val\r\n\u00a0 FROM Pedidos\r\n\u00a0WHERE ID_Cliente IN (@v1, @v2, @v3, @v4)\r\nGO\r\n<\/pre>\n<p class=\"MsoNormal\">For the query above we&#8217;ve the following execution plan:<\/p>\n<p class=\"illustration\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1214-image005.jpg\" alt=\"1214-image005.jpg\" width=\"624\" height=\"223\" \/><\/p>\n<p class=\"caption\">Figure 2 &#8211; Execution Plan \u00a0(click on this to see it full-size)<\/p>\n<p class=\"MsoNormal\">As you can see, the only difference between the queries is that now we are using variables instead of constant values, but the Query Optimizer creates a very different execution plan for this query. So the question is: What do you think? Do you think that SQL should have used the same execution plan for this query?<\/p>\n<p class=\"MsoNormal\">The right answer is No. Why not? Because at the compile time SQL Server doesn&#8217;t know the values of the constants, and if the values turn out to be duplicates, then it will read the same data twice. Suppose that the value of the <b>@v2<\/b> is also &#8220;1&#8221;, SQL will read the ID 1 twice, one for variable <b>@v1<\/b> and another for variable <b>@v2<\/b>, something that we don&#8217;t expect to see since we expect performance, read the same data twice is not good. So it has to uses the Merge Interval to remove the duplicate occurrences.<\/p>\n<p class=\"MsoNormal\">Let&#8217;s wait a minute Fabiano! Are you saying that for the first query, QO automatically removes the duplicated occurrences in the IN clause?<\/p>\n<p class=\"MsoNormal\">Yes. Do want to see it?<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">SELECT SUM(Valor) AS Val\r\n\u00a0 FROM Pedidos\r\n\u00a0WHERE ID_Cliente IN (1,1,3,4)\r\nGO\r\n<\/pre>\n<p class=\"MsoNormal\">For the query above we&#8217;ve the following execution plan:<\/p>\n<p class=\"illustration\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1214-image006-630x230.jpg\" alt=\"1214-image006-630x230.jpg\" width=\"630\" height=\"230\" \/><\/p>\n<p class=\"caption\">Figure 3 &#8211; Execution Plan \u00a0(click on this to see it full-size)<\/p>\n<p class=\"MsoNormal\">You will see that now we only have three Seek Predicates. Perfect.<\/p>\n<p class=\"MsoNormal\">Let&#8217;s back to Merge Interval plan.<\/p>\n<p class=\"MsoNormal\">The plan is using the operators <a href=\"http:\/\/www.simple-talk.com\/sql\/learn-sql-server\/showplan-operator-of-the-week---compute-scalar\/\">Compute Scalar<\/a>, <a href=\"http:\/\/www.simple-talk.com\/sql\/learn-sql-server\/showplan-operator-of-the-week---concatenation\/\">Concatenation<\/a>, <a href=\"http:\/\/www.simple-talk.com\/sql\/learn-sql-server\/showplan-operator-of-the-week---sort\/\">Sort<\/a> and Merge Interval to eliminate the duplicated values at the execution plan phase.<\/p>\n<p class=\"MsoNormal\">At this time, maybe some questions are rising in your mind. First: Why SQL Server don&#8217;t just uses a DISTINCT in the IN variables to remove the joins? Second: Why this is called a &#8220;Merge&#8221;, I didn&#8217;t see anything related to a merge here.<\/p>\n<p class=\"MsoNormal\">The answer is that the Query Optimizer (QO) uses this operator to perform the DISTINCT because, with this code, the QO also recognize overlapping intervals and will potentially merge these to non-overlapping intervals that will then be used to seek the values. To understand this better let&#8217;s suppose that we have the following query that doesn&#8217;t use variables.<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">SELECT SUM(Valor) AS Val\r\n\u00a0 FROM Pedidos\r\n\u00a0WHERE ID_Cliente BETWEEN 10 AND 25\r\n\u00a0\u00a0\u00a0 OR ID_Cliente BETWEEN 20 AND 30\r\nGO\r\n<\/pre>\n<p class=\"MsoNormal\">Now, let&#8217;s look at the execution plan:<\/p>\n<p class=\"illustration\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1214-image008-630x239.jpg\" alt=\"1214-image008-630x239.jpg\" width=\"630\" height=\"239\" \/><\/p>\n<p class=\"caption\">Figure 4 &#8211; Execution Plan \u00a0(click on this to see it full-size)<\/p>\n<p class=\"MsoNormal\">Notice how smart the Query Optimizer was. (That&#8217;s is why I love it!) It recognizes the overlap between the predicates, and instead of doing two seeks in the index (one for each between filter), it creates a plan that performs just one seek.<\/p>\n<p class=\"MsoNormal\">Now let&#8217;s change the query to use the variables.<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">DECLARE @v_a1 Int = 10,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 @v_b1 Int = 20,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 @v_a2 Int = 25,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 @v_b2 Int = 30\r\n\u00a0\r\nSELECT SUM(Valor) AS Val\r\n\u00a0 FROM Pedidos\r\n\u00a0WHERE ID_Cliente BETWEEN @v_a1 AND @v_a2\r\n\u00a0\u00a0\u00a0 OR ID_Cliente BETWEEN @v_b1 AND @v_b2\r\nGO\r\n<\/pre>\n<p class=\"MsoNormal\">For this \u00a0query \u00a0we&#8217;ve the following execution plan:<\/p>\n<p class=\"illustration\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1214-ExecutionPlan.jpg\" alt=\"1214-ExecutionPlan.jpg\" width=\"630\" height=\"160\" \/><\/p>\n<p class=\"caption\">Figure 5 &#8211; Execution Plan \u00a0(click on this to see it full-size)<\/p>\n<p class=\"MsoNormal\">Let&#8217;s check what the plan is doing using a different perspective. First let&#8217;s understand the overlap.<\/p>\n<p class=\"illustration\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1214-image013.jpg\" alt=\"1214-image013.jpg\" width=\"622\" height=\"135\" \/><\/p>\n<p class=\"caption\">Figure 6 &#8211; Overlap between 20 and 25<\/p>\n<p class=\"MsoNormal\">In the figure 6 we can see that if SQL Server reads the ranges separately, it will read the range from 20 to 25 twice. I&#8217;ve used a small range to test with, but think in terms of a very large scan that we&#8217;d see in a production database; \u00a0if we can avoid this step, then we&#8217;ll see a great performance improvement.<\/p>\n<p class=\"illustration\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1214-image015.jpg\" alt=\"1214-image015.jpg\" width=\"624\" height=\"136\" \/><\/p>\n<p class=\"caption\">Figure 7 &#8211; After Merge Interval<\/p>\n<p class=\"MsoNormal\">After the Merge Interval runs, SQL Server can seek only the final range. It knows that is possible to go to <b>@v_a1<\/b> to <b>@vb_2<\/b> directly.<\/p>\n<h1>Finally<\/h1>\n<p class=\"MsoNormal\">To finish this subject I would like to recommend you to read about a bug in SQL Server 2005 caused by a mistake in this process, you could take a better look in the blog of \u00a0<a href=\"http:\/\/weblogs.sqlteam.com\/mladenp\/archive\/2008\/07\/14\/SQL-Server-2005-bug-when-using-LIKE-searches.aspx\">Mladen Prajdic<\/a> a SQL Server MVP from Slovenia.<\/p>\n<p class=\"MsoNormal\">I wouldn&#8217;t miss the opportunity to congratulate the Microsoft guys that build Icons in SQL Server\/Windows. I once read a book called &#8220;The Icon Book&#8221;, it was amazing how beautiful and meaningful the icons in the graphical query plan \u00a0are. The Merge Interval icon is perfect, if you look at the icon you will see what it is exactly doing. Brilliant, it&#8217;s incredible how they can express something in a small picture. Well Done!<\/p>\n<p class=\"MsoNormal\">That&#8217;s all folks, I hope you&#8217;ve enjoyed learning about Merge Join operator, and I&#8217;ll see you soon with more &#8220;Showplan Operators&#8221;.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>When Fabiano agreed to undertake the epic task of describing each showplan operator, none of us quite predicted the interesting ways that the series helps to understand how the query optimiser works. With the Merge Interval, Fabiano comes up with some insights about the way that the Query optimiser handles overlapping ranges efficiently&hellip;<\/p>\n","protected":false},"author":65554,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143525],"tags":[4178,4149,5335,5084,5162,5192,4150],"coauthors":[6809],"class_list":["post-1064","post","type-post","status-publish","format-standard","hentry","category-learn","tag-bi","tag-learn-sql-server","tag-merge-interval","tag-optimiser","tag-showplan","tag-showplan-operators","tag-sql"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1064","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/65554"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=1064"}],"version-history":[{"count":4,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1064\/revisions"}],"predecessor-version":[{"id":74793,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1064\/revisions\/74793"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=1064"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=1064"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=1064"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=1064"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}