{"id":73188,"date":"2014-08-29T16:40:25","date_gmt":"2014-08-29T16:40:25","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/uncategorized\/execution-plans-part-9-multiplication\/"},"modified":"2021-07-14T13:07:30","modified_gmt":"2021-07-14T13:07:30","slug":"execution-plans-part-9-multiplication","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/oracle-databases\/execution-plans-part-9-multiplication\/","title":{"rendered":"Execution Plans Part 9: Multiplication"},"content":{"rendered":"<p>In <a title=\"Execution Plans Part 8: Cost, time, etc.\" href=\"https:\/\/allthingsoracle.com\/execution-plans-part-8-cost-time-etc\/\">part 8<\/a> we looked at a very simple execution plan involving a hash join; but that example was too simple to give us the full flavour of the arithmetic involved in Oracle\u2019s predictions because every operation executed just once. We need to see some plans where each execution of a parent operation could requires multiple executions of its child operations; and that\u2019s what we\u2019re going to do in parts 9 and 10.<\/p>\n<h2>Getting started<\/h2>\n<p>Thanks to the constant evolution of cost based query transformation (CBQT) it\u2019s surprisingly hard to find an example of a simple execution plan that actually displays the information that it \u201creally ought to\u201d, so I\u2019ll have to show you a couple of examples where the way the numbers are displayed is inappropriate, and then explain the history behind the anomalies. In this article we\u2019ll look at a simple nested loop join (using 11.2.0.4):<\/p>\n<pre>select\r\n\tt1.id, t2.id\r\nfrom\r\n\tt2, t1\r\nwhere\r\n\tt2.n1 = 15\r\nand\tt1.n1 = t2.n2\r\n;\r\n\r\n---------------------------------------------------------------------------------------\r\n| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |\r\n---------------------------------------------------------------------------------------\r\n|   0 | SELECT STATEMENT              |       |   225 |  4500 |    46   (0)| 00:00:01 |\r\n|   1 |  NESTED LOOPS                 |       |   225 |  4500 |    46   (0)| 00:00:01 |\r\n|   2 |   NESTED LOOPS                |       |   225 |  4500 |    46   (0)| 00:00:01 |\r\n|   3 |    TABLE ACCESS BY INDEX ROWID| T2    |    15 |   180 |    16   (0)| 00:00:01 |\r\n|*  4 |     INDEX RANGE SCAN          | T2_I1 |    15 |       |     1   (0)| 00:00:01 |\r\n|*  5 |    INDEX RANGE SCAN           | T1_I1 |    15 |       |     1   (0)| 00:00:01 |\r\n|   6 |   TABLE ACCESS BY INDEX ROWID | T1    |    15 |   120 |     2   (0)| 00:00:01 |\r\n---------------------------------------------------------------------------------------\r\n\r\nPredicate Information (identified by operation id):\r\n---------------------------------------------------\r\n   4 - access(\"T2\".\"N1\"=15)\r\n   5 - access(\"T1\".\"N1\"=\"T2\".\"N2\")<\/pre>\n<p>As you can see, this plan has taken the new <em>\u2018nlj_batching\u2019<\/em> strategy which shows up in the doubling of the nested loop operation. The order of operation is 4, 3, 5, 2, 6, 1, 0: We do an index range scan of <em><strong>t2_i1<\/strong><\/em> to identify rows from <em><strong>t2<\/strong><\/em>, for each row we acquire from <em><strong>t2<\/strong><\/em> we do an index range scan of <em><strong>t1_i1<\/strong><\/em>, then for each rowid we fetch from <em><strong>t1_i1<\/strong><\/em> we visit <em><strong>t1<\/strong><\/em> to get the row. Essentially the predicted row counts are correct \u2013 my data set will supply 15 rows from <em><strong>t2<\/strong><\/em>, and for my data set each row in <em><strong>t2<\/strong><\/em>\u00a0will join to 15 row in\u00a0<em><strong>t1<\/strong><\/em> for a total result set of <em><strong>225<\/strong><\/em> rows. But what do the figures actually tell us ?<\/p>\n<p>I\u2019m going to ignore the Bytes column \u2013 we learned all we need to know about what it\u2019s trying to say, and how badly it says it, in the last installment. I\u2019m also going to ignore the Time column since we know that it is derived from the Cost column multiplied by the <em><strong>sreadtim<\/strong><\/em> from <em><strong>sys.aux_stats$<\/strong><\/em>.<\/p>\n<p>Lines 4 and 3 tell use we pick up 15 rowids then pick up 15 rows; the costs are 1 and 16 respectively \u2013 one leaf block physical read then (remembering that we accumulate up the plan from child to parent) 15 table block reads to visit the table plus the one leaf block read for a total cost of 16.<\/p>\n<p>Line 5 says we do an index range scan of <em><strong>t1_i1<\/strong><\/em> at a cost of 1 to pick up 15 rows (and yes, each time we do it for my data set we do visit just one leaf block and find 15 rowids). Those figures, of course, are the prediction for one execution of line 5 &#8211; but when the query runs the optimizer isn&#8217;t expecting line 5 to be executed just once.<\/p>\n<h2>Hitting Problems<\/h2>\n<p>We (and the optimizer) expect to do that index range scan 15 times in total and that\u2019s why the nested loop join at line 2 tells us (correctly) that it will generate 225 rows, but where does the cost of 46 come from? The answer is simple: it\u2019s a bug, more or less, that reflects history. The cost at line 3 <strong>ought<\/strong> to say 31 because (accumulating from the children up \u2013 and remembering to allow for multiple child executions) it\u2019s: 16 (one execution of line 3) + 1 (one execution of line 5) * 15 (because we expect 15 executions thanks to the 15 rows predicted in line 3).<\/p>\n<p>Suspend the question <em>\u201cwhy 46 instead of 31?\u201d<\/em> for a moment and go on to line 6 \u2013 which is completely wrong. For each of the 225 items generated in line 2 we will execute line 6 to find one row in table <em><strong>t1<\/strong><\/em> (not the fifteen rows predicted); and an appropriate cost of a single execution of table access by rowid is one (not the two predicted).<\/p>\n<p>There&#8217;s a reason why the figures at this point are inappropriate: they were designed to reflect the execution strategy that the optimizer used to produce in Oracle 8.1 and earlier, which would have looked like this (using <em><strong>utlxpls.sql<\/strong><\/em> on a very old version Oracle):<\/p>\n<pre>--------------------------------------------------------------------------------\r\n| Operation                 |  Name    |  Rows | Bytes|  Cost  | Pstart| Pstop |\r\n--------------------------------------------------------------------------------\r\n| SELECT STATEMENT          |          |   225 |    4K|     35 |       |       |\r\n|  NESTED LOOPS             |          |   225 |    4K|     35 |       |       |\r\n|   TABLE ACCESS BY INDEX RO|T2        |    15 |  180 |      5 |       |       |\r\n|    INDEX RANGE SCAN       |T2_I1     |    15 |      |      1 |       |       |\r\n|   TABLE ACCESS BY INDEX RO|T1        |     3K|   23K|      2 |       |       |\r\n|    INDEX RANGE SCAN       |T1_I1     |     3K|      |      1 |       |       |\r\n--------------------------------------------------------------------------------<\/pre>\n<p>It\u2019s unfortunate that in this case the Rows values for the access to <em><strong>t1<\/strong><\/em> and its index are completely wrong, but you can appreciate that this plan is telling us that <em>\u201cfor each row we return from <strong>t2<\/strong> we do an index range scan of <strong>t1_i1<\/strong> to get (15, rather than the 3,000) rows from <strong>t1<\/strong>\u201d<\/em> and that the cost of the index range is 1 (one leaf block) while the cost of the table access (including its child index access) is 2 (the data in <em><strong>t1<\/strong><\/em> happens to be very well clustered). In this case it\u2019s clear, then, that the cost of the nested loop is: 5 (t2 access) + 2 (t1 access) * 15 (predicted executions of t1 access) = 35.<\/p>\n<p>So, line 6 in our original plan is reporting (with bug fixes) 8i numbers for the operations, even though the shape of the execution plan has changed and the operational mechanism is different. Technically we could argue that line 6 should look like this:<\/p>\n<pre>---------------------------------------------------------------------------------------\r\n| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |\r\n---------------------------------------------------------------------------------------\r\n|   6 |   TABLE ACCESS BY INDEX ROWID | T1    |     1 |     8 |     1   (0)| 00:00:01 |\r\n---------------------------------------------------------------------------------------<\/pre>\n<p>Each time we execute it, we will acquire one row, and we will visit one block. Of course, we will be executing this line 225 times, which means that applying the rule\u00a0<em>\u201caccumulate children up to their parent\u201d<\/em> will produce a ridiculous cost \u2013 we\u2019re caught in a trap between the optimizer doing its traditional arithmetic to determine the resource requirement and the execution plan showing the strategy the execution engine will actually take. Arguably we have a choice between the plan that Oracle shows us, and the following alternative which is, in some respects, a little more truthful:<\/p>\n<pre>---------------------------------------------------------------------------------------\r\n| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |\r\n---------------------------------------------------------------------------------------\r\n|   0 | SELECT STATEMENT              |       |   225 |  4500 |    46   (0)| 00:00:01 |\r\n|   1 |  NESTED LOOPS                 |       |   225 |  4500 |    46   (0)| 00:00:01 |\r\n|   2 |   NESTED LOOPS                |       |   225 |  2700 |    31   (0)| 00:00:01 |\r\n|   3 |    TABLE ACCESS BY INDEX ROWID| T2    |    15 |   180 |    16   (0)| 00:00:01 |\r\n|*  4 |     INDEX RANGE SCAN          | T2_I1 |    15 |       |     1   (0)| 00:00:01 |\r\n|*  5 |    INDEX RANGE SCAN           | T1_I1 |    15 |       |     1   (0)| 00:00:01 |\r\n|   6 |   TABLE ACCESS BY INDEX ROWID | T1    |     1 |     8 |     1   (0)| 00:00:01 |\r\n---------------------------------------------------------------------------------------<\/pre>\n<p>In this version of the plan I\u2019ve adjusted the cost (and bytes) in line 2, to reflect the execution of the index range on <em><strong>t1_i1<\/strong><\/em>; and I\u2019ve corrected the rows and cost (and bytes) in line 6 to reflect the execution of the table access.<\/p>\n<p>What I haven\u2019t done is change the cost in line 1 even though you could argue that the cost should be 256 (calculated as 31 + 225 * 1). The trouble is that the cost of 1 in line 6 isn\u2019t truthful. Because the data in <em><strong>t1<\/strong><\/em> is very well clustered we know we\u2019re not going to read a block on every execution of line 6, we may read a block for the first row of a set of 15 but we can be confident that the next 14 executions will visit the same block in memory \u2013 on average the cost of acquiring one row from <em><strong>t1<\/strong><\/em> by rowid should be 1\/15. <em>(Note: in fact the optimizer does do arithmetic to several decimal places, but rounds the figures for reporting in the execution plan; this is why you can sometimes see figures that suggest things like 2 \u00a0* 3 = 5, internally it might have been 1.8 * 2.6 = 4.68.)<\/em><\/p>\n<p>As this example shows there\u2019s a presentation problem associated with execution plans. The optimizer tries to work out a resource cost and a mechanical strategy for getting the data you want, but the way the resource cost is calculated isn\u2019t always visually compatible with the shape of the execution plan; so when you try to create some simple rules in an attempt to understand the numbers you will find some anomalies. When the basic strategy of <em>\u201caccumulate the children to the parent\u201d<\/em> (including the bit about multiplying up correctly) doesn\u2019t make sense a little history and a little flexibility may give you a clue about how to re-interpret the numbers.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In part 8 we looked at a very simple execution plan involving a hash join; but that example was too simple to give us the full flavour of the arithmetic involved in Oracle\u2019s predictions because every operation executed just once. We need to see some plans where each execution of a parent operation could requires multiple executions of its child&hellip;<\/p>\n","protected":false},"author":101205,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143533],"tags":[4783],"coauthors":[],"class_list":["post-73188","post","type-post","status-publish","format-standard","hentry","category-oracle-databases","tag-execution-plans"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/73188","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/101205"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=73188"}],"version-history":[{"count":1,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/73188\/revisions"}],"predecessor-version":[{"id":91676,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/73188\/revisions\/91676"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=73188"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=73188"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=73188"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=73188"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}