{"id":73144,"date":"2015-11-04T10:00:51","date_gmt":"2015-11-04T10:00:51","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/uncategorized\/basics-of-the-cost-based-optimizer-part-6\/"},"modified":"2021-07-14T13:07:19","modified_gmt":"2021-07-14T13:07:19","slug":"basics-of-the-cost-based-optimizer-part-6","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/oracle-databases\/basics-of-the-cost-based-optimizer-part-6\/","title":{"rendered":"Basics of the Cost Based Optimizer &#8211; part 6"},"content":{"rendered":"<p>In <a href=\"https:\/\/allthingsoracle.com\/?p=5741\">part 5 of this series<\/a> I started working through a list of queries designed to demonstrate ways in which the optimizer can produce bad cardinality estimates. In this installment I&#8217;m going work through the remainder of the list. The outstanding queries are as follows:<\/p>\n<pre>select count(*) from t1 where trunc_5 &gt; (\r\n\tselect max(trunc_5) from t1 where mod_200 = 0\r\n);\r\n\r\nselect count(*) from t1 where five_per_sec &gt; sysdate - 60\/86400;\r\n\r\nexecute dbms_lock.sleep(10)\r\nselect count(*) from t1 where five_per_sec &gt; sysdate - 10\/86400;\r\n\r\n\r\nvariable b1 number\r\nexec :b1 := 10\r\nselect count(*) from t1 where skewed = :b1;\r\n\r\nexec :b1 := 100\r\nselect count(*) from t1 where skewed = :b1;\r\n<\/pre>\n<p>The first query demonstrates the re-appearance of one of the optimizer\u2019s favourite guesses &#8211; 5% &#8211; which is frequently used as an estimate of the effect of a subquery. The next pair (with the 10 second sleep in between) demonstrates the potentially catastrophic side effects of statistics going out of date. 
The final pair introduces us to some of the problems of bind variables and skewed data distributions.<\/p>\n<p>Remember that I\u2019m running all these demonstrations on 11.2.0.4 and, so far, haven\u2019t generated histograms on any of the columns referenced by the queries.<\/p>\n<h3>Subquery effects<\/h3>\n<p>Here\u2019s the default execution plan from the first query in our list:<\/p>\n<pre>-----------------------------------------------------------------------------\r\n| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |\r\n-----------------------------------------------------------------------------\r\n|   0 | SELECT STATEMENT     |      |       |       |  1308 (100)|          |\r\n|   1 |  SORT AGGREGATE      |      |     1 |     5 |            |          |\r\n|*  2 |   TABLE ACCESS FULL  | T1   | 50000 |   244K|   658   (9)| 00:00:04 |\r\n|   3 |    SORT AGGREGATE    |      |     1 |     9 |            |          |\r\n|*  4 |     TABLE ACCESS FULL| T1   |  5000 | 45000 |   650   (8)| 00:00:04 |\r\n-----------------------------------------------------------------------------\r\n\r\nPredicate Information (identified by operation id):\r\n---------------------------------------------------\r\n   2 - filter(\"TRUNC_5\"&gt;)\r\n   4 - filter(\"MOD_200\"=0)\r\n<\/pre>\n<p>The query identified 195 rows, but the prediction was 50,000 \u2013 which is 5% of the starting 1,000,000.<\/p>\n<p>The error doesn\u2019t really matter in this simple example, but you can imagine that in a more complex query \u2013 one that joins the result set from line 2 to another table \u2013 an estimate of 50,000 would encourage the optimizer to do a hash join to the next table, while an estimate that was closer to the actual 195 might encourage the optimizer to do a nested loop.<\/p>\n<p>Although there are a few options available, there isn\u2019t a nice way to address this problem because that 5% is hard-coded into the optimizer.<\/p>\n<p>First, of course, you could 
simply work out an optimum plan and create a complete set of hints to enforce that plan (perhaps improving the robustness of the hinting by generating the equivalent SQL Plan Baseline after getting the plan you want). If the query were simple enough, you might be able to get away with a suitably populated <em><strong>\/*+ leading() *\/<\/strong><\/em> hint in the main query block and the <em><strong>\/*+ no_unnest push_subq *\/<\/strong><\/em> pair in the subquery.<\/p>\n<p>Alternatively, if you\u2019re licensed for the Tuning Pack, you could use the <em><strong>dbms_sqltune<\/strong><\/em> package (perhaps through the graphical OEM interface) to create an SQL Profile for the query. This will exercise the query, discover that the default 5% is a bad estimate, and create a set of <strong><em>\u201copt_estimate()\u201d<\/em><\/strong> hints to help the optimizer calculate better estimates, which ought to result in a better plan being selected.<\/p>\n<p>Finally, you may be lucky enough to be able to add a \u201csimple\u201d <em><strong>cardinality()<\/strong><\/em> hint to the SQL that gets the optimizer started with a suitable estimate and allows it to generate reasonable estimates for the rest of the plan. 
Unfortunately, the <em><strong>cardinality()<\/strong><\/em> hint is not quite supported, not really documented, and is much more subtle (and therefore a lot harder to use) than the few notes available on the Internet might suggest.<\/p>\n<p>In my trivial case I could simply add the hint <em><strong>\/*+ cardinality(t1 195) *\/<\/strong><\/em> to the SQL:<\/p>\n<pre>select\r\n        \/*+ cardinality(t1 195) *\/\r\n        count(*)\r\nfrom    t1\r\nwhere   trunc_5 &gt; (\r\n                select max(trunc_5) from t1 where mod_200 = 0\r\n        )\r\n;\r\n-----------------------------------------------------------------------------\r\n| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |\r\n-----------------------------------------------------------------------------\r\n|   0 | SELECT STATEMENT     |      |       |       |  1308 (100)|          |\r\n|   1 |  SORT AGGREGATE      |      |     1 |     5 |            |          |\r\n|*  2 |   TABLE ACCESS FULL  | T1   |   195 |   975 |   658   (9)| 00:00:04 |\r\n|   3 |    SORT AGGREGATE    |      |     1 |     9 |            |          |\r\n|*  4 |     TABLE ACCESS FULL| T1   |  5000 | 45000 |   650   (8)| 00:00:04 |\r\n-----------------------------------------------------------------------------\r\n<\/pre>\n<p>This hasn\u2019t changed the cost of the query, but it has changed the cardinality estimate at line 2 as required. Oracle doesn\u2019t use the <em><strong>cardinality()<\/strong><\/em> hint internally \u2013 if you check the optimizer\u2019s trace file (event 10053), you\u2019ll see that the hint has been translated to an <em><strong>opt_estimate()<\/strong><\/em> hint of the form:<\/p>\n<pre>opt_estimate(table t1@sel$1 rows=195.000000)\r\n<\/pre>\n<p>Like all \u201cmicro-management\u201d hinting, though, you typically need far more cardinality hints than you might first imagine. 
As it stands this hint is a \u201csingle table access\u201d hint, but in more complex cases you may have to include lots of hints to say things like <em>\u201cif you start at t1 it will give you 195 rows, but if you start at table t2 it will give you 5000 rows\u201d<\/em>; then you will probably have to use the \u201cjoin cardinality\u201d version of the hint to say things like: <em>\u201cif you start with t1 -&gt; t2 that join will give you 800 rows, but t1 -&gt; t3 will give you 650 rows\u201d<\/em>. Generally speaking, it\u2019s a bit of a pain to try to work out all the cardinality hints you need to ensure that the plan you want will appear. It\u2019s best to avoid the technique and stick with more \u201ctraditional\u201d hinting.<\/p>\n<h3>The passage of time<\/h3>\n<p>To run my test I created the data, gathered stats, then ran various queries and pulled their execution plans from memory. By the time I ran the query that compared column <em><strong>five_per_sec<\/strong><\/em> with <strong><em>&#8220;sysdate \u2013 60 seconds&#8221;<\/em><\/strong> a few seconds had already passed between creating the data and running the query. The time component on the highest value of <em><strong>five_per_sec<\/strong><\/em> was 17:15:17, while the time I ran the query had moved on to 17:15:31, a gap of 14 seconds \u2013 which meant there were 46 seconds of data in the table matching the predicate <em>\u201cfive_per_sec &gt; sysdate \u2013 60\/86400\u201d<\/em>. 
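<\/p>\n<p>If you want to see the value the optimizer is working from, you can decode the stored high value for the column. The following is just a sketch (using the documented <em><strong>dbms_stats.convert_raw_value<\/strong><\/em> procedure against my demonstration table) rather than part of the original test:<\/p>\n<pre>declare\r\n        m_high  date;\r\nbegin\r\n        for r in (\r\n                select  high_value\r\n                from    user_tab_col_statistics\r\n                where   table_name  = 'T1'\r\n                and     column_name = 'FIVE_PER_SEC'\r\n        ) loop\r\n                dbms_stats.convert_raw_value(r.high_value, m_high);\r\n                dbms_output.put_line(to_char(m_high,'dd-mon-yyyy hh24:mi:ss'));\r\n        end loop;\r\nend;\r\n\/\r\n<\/pre>\n<p>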
Here\u2019s the execution plan I got when running the query:<\/p>\n<pre>---------------------------------------------------------------------------\r\n| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |\r\n---------------------------------------------------------------------------\r\n|   0 | SELECT STATEMENT   |      |       |       |   752 (100)|          |\r\n|   1 |  SORT AGGREGATE    |      |     1 |     8 |            |          |\r\n|*  2 |   TABLE ACCESS FULL| T1   |   230 |  1840 |   752  (21)| 00:00:04 |\r\n---------------------------------------------------------------------------\r\n<\/pre>\n<p>As you can see, the optimizer\u2019s estimate is 230 rows \u2013 a perfect match for 46 seconds at five rows per second.<\/p>\n<p>What happens, though, if I wait a further 10 seconds (time is now 17:15:41) and then query for <em><strong>sysdate \u2013 10 \/ 86400<\/strong><\/em>? I\u2019m querying for data later than 17:15:31, and there is no data in the table matching that predicate, but what does the optimizer expect to find:<\/p>\n<pre>---------------------------------------------------------------------------\r\n| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |\r\n---------------------------------------------------------------------------\r\n|   0 | SELECT STATEMENT   |      |       |       |   752 (100)|          |\r\n|   1 |  SORT AGGREGATE    |      |     1 |     8 |            |          |\r\n|*  2 |   TABLE ACCESS FULL| T1   |     5 |    40 |   752  (21)| 00:00:04 |\r\n---------------------------------------------------------------------------\r\n<\/pre>\n<p>The optimizer&#8217;s estimate is 5 rows. 
A few extra tests will show you that this is the number of rows that match \u201ccolumn = constant\u201d \u2013 as soon as you execute a query with a range-based predicate that the optimizer can see is asking for data that is entirely outside the known low\/high range, the arithmetic falls back to that used for \u201ccolumn = constant\u201d.<\/p>\n<h3>The passage of time \u2013 case study<\/h3>\n<p>In my model the error was not too bad \u2013 if the data had carried on arriving at the same rate while I was doing my tests there would have been 50 rows in the last 10 seconds, so 5 wouldn\u2019t (necessarily) be a terrible estimate and might not have caused problems. But the basic behaviour can have a terrible impact, as the following production example shows.<\/p>\n<p>Data arrives at the rate of roughly 300 rows per minute (5 per second) and a very important polling job runs a query every few minutes to identify any data that has arrived in the last 15 minutes (and is still in a certain state) \u2013 typically identifying about 2,000 to 4,000 rows. The query is fairly simple and includes the time-critical predicate:<\/p>\n<pre>where arrival_time &gt; sysdate \u2013 1\/96\r\n<\/pre>\n<p>Because of the rate of data change, the automatic stats collection job runs every night on this table, so the statistics are as accurate as a simple, default, setup could make them.<\/p>\n<p>Every few days &#8211; apparently completely at random &#8211; the polling job hits a terrible performance problem and takes several minutes to run instead of a couple of seconds. Not surprisingly, this has an impact on the business operation and makes a big difference to how the users feel about the system. What\u2019s going on?<\/p>\n<p>Just 15 minutes after the stats have been collected the optimizer\u2019s estimate for this query is going to drop from a few thousand to a mere 5 (because, on average, there are 5 rows for any given value of <em><strong>arrival_time<\/strong><\/em>). 
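<\/p>\n<p>You can derive that fallback figure from the basic column statistics. The following is only a sketch; the table and column names (<em><strong>arrivals<\/strong><\/em> and <em><strong>arrival_time<\/strong><\/em>) are hypothetical stand-ins for the production objects:<\/p>\n<pre>select\r\n        t.num_rows,\r\n        c.num_distinct,\r\n        round(t.num_rows \/ c.num_distinct)      fallback_estimate\r\nfrom\r\n        user_tables                     t,\r\n        user_tab_col_statistics         c\r\nwhere\r\n        t.table_name  = 'ARRIVALS'\r\nand     c.table_name  = t.table_name\r\nand     c.column_name = 'ARRIVAL_TIME'\r\n;\r\n<\/pre>\n<p>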
If this happens, the plan changes catastrophically because the optimizer decides that it can find a really good strategy for dealing with 5 rows that turns out to be a very bad strategy for the several thousand rows that are really there.<\/p>\n<p>The query is executed every few minutes, so it usually gets re-optimised \u2013 producing a good execution plan &#8211; very shortly after the stats have just been collected. Most days the cursor then stays in memory and is constantly re-used for the rest of the day, so the optimizer never needs to re-optimise the statement and never has a chance to create the bad plan. Unfortunately, something occasionally happens to create a huge demand for memory in the shared pool; the good plan is flushed from memory, the query is re-optimised several hours after the stats were collected \u2013 producing the bad \u201conly 5 rows\u201d plan \u2013 and the query then runs badly until the next day or until (as the client discovered one day) someone gathers stats on the table.<\/p>\n<p>Columns that are time (or sequence) based can introduce all sorts of odd problems \u2013 though you may have to be a little unlucky to suffer badly as a consequence; but if you have any code which is always looking for \u201cthe last few rows\u201d, this problem is one to watch out for. There are three basic strategies you could follow if you see the problem:<\/p>\n<ul>\n<li>Hard-code the good execution plan with hints (or an SQL Plan Baseline \/ Stored Outline).<\/li>\n<li>Set up a job to gather stats on the critical column and invalidate all related cursors as frequently as necessary (every 5 minutes would be appropriate for the case study).<\/li>\n<li>Write some code that calls <em><strong>dbms_stats.set_column_stats()<\/strong><\/em> to set the high-value on the column stats to a future value that is far enough forward to avoid the catastrophic change in plan (but not so far that it causes the optimizer to err in the opposite direction). 
This is really just a variant of the second option, but (a) it\u2019s cheaper to execute and (b) you may be able to execute it far less frequently than you would the job to gather stats.<\/li>\n<\/ul>\n<h3>Bind Variables<\/h3>\n<p>The last two queries in the list introduce the problems of bind variables. We will be visiting this topic in more detail in a future installment in the series; the only point I want to make here is that skewed data patterns and bind variables do not mix \u2013 you have to help the optimizer.<\/p>\n<p>Looking at the way the data was generated, we can work out that the first query of the pair should return 21 rows and the second should return 201 rows (there are 2N + 1 rows for &#8220;<em><strong>skewed<\/strong><\/em> = N&#8221;), but the execution plan in both cases predicts 1,000 rows:<\/p>\n<pre>---------------------------------------------------------------------------\r\n| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |\r\n---------------------------------------------------------------------------\r\n|   0 | SELECT STATEMENT   |      |       |       |   666 (100)|          |\r\n|   1 |  SORT AGGREGATE    |      |     1 |     4 |            |          |\r\n|*  2 |   TABLE ACCESS FULL| T1   |  1000 |  4000 |   666  (11)| 00:00:04 |\r\n---------------------------------------------------------------------------\r\n\r\nPredicate Information (identified by operation id):\r\n---------------------------------------------------\r\n   2 - filter(\"SKEWED\"=:B1)\r\n<\/pre>\n<p>This is because we did nothing to let the optimizer know that the distribution of the data values was hugely uneven, so the available statistics simply tell it that there are 1,000,000 rows in total with 1,000 distinct values, and any one value is expected to return 1,000,000 \/ 1,000 = 1,000 rows. 
We would have got the same result even if we had used the literal values 10 and 100 in the queries rather than using bind variables.<\/p>\n<p>To help Oracle understand the skew, we have to create a histogram on the column. Unfortunately I have 1,000 distinct values in the column, and in 11g a histogram consists of a maximum of 255 rows (equating to 254 buckets), so Oracle is constrained to generating a height-balanced histogram on the column. In fact, although we can see that the variation between the scarcest data and the most common data is enormous (1 row for skewed = 0 and 1,999 rows for skewed = 999), when left to \u201cauto-detect\u201d the need for a histogram Oracle doesn\u2019t even create one. And if we force Oracle to construct a histogram of 254 buckets, it doesn\u2019t capture any popular values (with 1M rows and 254 buckets, a value would have to appear in about 4,000 rows before it could register as popular and, if you&#8217;re unlucky, might fail to be noticed even when it appeared in around 8,000 rows). So in 11g a \u201cgathered\u201d histogram won\u2019t help.<\/p>\n<p>To continue the demonstration I\u2019ve switched to 12c. Again, if I tell Oracle to use \u201csize auto\u201d to gather a histogram on the column, Oracle decides to use 254 buckets and then decides it\u2019s not worth it. However, in 12c I can have up to 2048 rows in the histogram, and when I tell Oracle to use the maximum it creates a frequency histogram with one row for each of the 1,000 values in the column. 
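<\/p>\n<p>For reference, the gather call I\u2019m describing looks like the following sketch; it uses the standard <em><strong>dbms_stats.gather_table_stats<\/strong><\/em> interface with the column name from my demonstration table:<\/p>\n<pre>begin\r\n        dbms_stats.gather_table_stats(\r\n                ownname    =&gt; user,\r\n                tabname    =&gt; 'T1',\r\n                method_opt =&gt; 'for columns skewed size 2048'\r\n        );\r\nend;\r\n\/\r\n<\/pre>\n<p>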
This is the plan I then get for the two queries:<\/p>\n<pre>---------------------------------------------------------------------------\r\n| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |\r\n---------------------------------------------------------------------------\r\n|   0 | SELECT STATEMENT   |      |       |       |   665 (100)|          |\r\n|   1 |  SORT AGGREGATE    |      |     1 |     4 |            |          |\r\n|*  2 |   TABLE ACCESS FULL| T1   |    21 |    84 |   665  (11)| 00:00:01 |\r\n---------------------------------------------------------------------------\r\n\r\nPredicate Information (identified by operation id):\r\n---------------------------------------------------\r\n   2 - filter(\"SKEWED\"=:B1)\r\n<\/pre>\n<p>It\u2019s the same plan for both queries, but at least it predicts the correct number of rows for the first execution. The optimizer manages to do this because on the first execution the query has to be optimized and the optimizer \u201cpeeks\u201d at the actual value of the bind variable to help it choose the plan, so the query is optimized as if the predicate read: <em>\u201cskewed = 10\u201d<\/em>, and the histogram has the information that there are 21 rows for the value 10.<\/p>\n<p>Unfortunately, when we execute the query with a new value for the bind variable, the optimizer recognizes that it has seen the exact same text in the very recent past and discovers that it has an execution plan in memory that is legal for the query, and simply re-uses it.<\/p>\n<p>Bind peeking and re-using execution plans are beneficial because we don\u2019t compete for and use up resources to optimize every single query execution; but when the data distribution is so uneven that different input values produce dramatically different sizes of result set, plan re-use becomes a liability and we may need to re-optimize statements that are visibly identical but mechanically very different.<\/p>\n<p>This conflict between bind 
variables and histograms is a long-standing issue, and Oracle has introduced many strategies over the last 20 years to limit the damaging side effects of the conflict; ultimately, though, the application developer needs to be aware of the threat and identify the special cases where it is necessary to get the database and the client code co-operating to deal with extremely skewed data distributions.<\/p>\n<h3>Summary<\/h3>\n<p>I\u2019ve covered three \u201charder\u201d problems of optimization. The first is that subqueries introduce a \u201cfixed guess\u201d of 5% that is hard to work around unless you\u2019re prepared to hint an actual execution path, use a poorly documented <em><strong>cardinality()<\/strong><\/em> hint, or use the tuning tool to generate an SQL Profile (set of <em><strong>opt_estimate()<\/strong><\/em> hints).<\/p>\n<p>The second is that a column based on time or sequence values can introduce random instability if the most important queries are to find \u201cthe most recent data\u201d. The algorithm that the optimizer uses when a query uses an \u201cout of range\u201d value to optimize a range-based predicate could result in a very bad cardinality estimate, leading to a bad execution plan. Either you hint (or create an SQL Plan Baseline for) the execution plan you need, or you take action to keep the <em><strong>high_value<\/strong><\/em> recorded for the critical column at a suitably recent (or future) value so that the optimizer always gets the right plan when re-optimising the query.<\/p>\n<p>Finally, we have to recognize that bind variables and skewed data patterns do not mix &#8211; even in the presence of histograms. 
We will say more about this in a future installment, but for the moment we note that if the same query text could produce extreme variations in the volume of data reported, then you probably need to create a histogram on the critical column; but you then need to ensure that your code co-operates with the database (perhaps by using literals for the critical predicates) to make the most effective use of that histogram.<\/p>\n<p><a href=\"http:\/\/jonathanlewis.wordpress.com\/cbo-series\/\">\u2013&gt; Catalogue of current articles in CBO series.<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In part 5 of this series I started working through a list of queries designed to demonstrate ways in which the optimizer can produce bad cardinality estimates. In this installment I&#8217;m going to work through the remainder of the list. The outstanding queries are as follows: select count(*) from t1 where trunc_5 &gt; ( select max(trunc_5) from t1 where mod_200 =&hellip;<\/p>\n","protected":false},"author":101205,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143533],"tags":[48347,48383],"coauthors":[],"class_list":["post-73144","post","type-post","status-publish","format-standard","hentry","category-oracle-databases","tag-12c","tag-cost-based-optimizer"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/73144","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/101205"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=73144"}],"version-history":[{"co
unt":1,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/73144\/revisions"}],"predecessor-version":[{"id":91640,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/73144\/revisions\/91640"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=73144"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=73144"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=73144"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=73144"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}