Dynamic Sampling (I), an Introduction – Part 2

If you follow me on my blog you might wonder if I let you get away without an actual test case, and of course I won’t do so. Here is a simple test case that demonstrates Dynamic Sampling in action:

```sql
set echo on timing on time on

drop table t;
purge table t;

-- Create a sample table
create table
```
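The script is truncated in this copy of the post, so as a rough sketch only: the table name t comes from the script itself, while the row count of 1,000,000, the column names and the filter value are my assumptions, derived from the 100,000 actual rows and the 10 percent filter selectivity quoted later in the text. The test case might have looked similar to this:

```sql
set echo on timing on time on

drop table t;
purge table t;

-- Create a sample table: 1,000,000 rows, no statistics gathered.
-- Column ATTR has ten distinct values with uniform distribution,
-- so a single value matches exactly 100,000 rows (10 percent).
create table t
as
select
       rownum              as id
     , mod(rownum, 10) + 1 as attr
     , rpad('x', 100)      as filler
from
       dual
connect by
       level <= 1000000;

-- Show the execution plan and execution statistics via AUTOTRACE
set autotrace traceonly explain statistics

select count(*) from t where attr = 1;

set autotrace off
```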

Note: Although for the sake of simplicity this test case uses the SQL*Plus AUTOTRACE feature to display the execution plan along with the actual execution of a SQL statement, it should be mentioned that AUTOTRACE has its quirks and is not always reliable, in particular when dealing with bind variables. Although it seems to be a handy feature for a quick comparison of SQL statement performance, I strongly recommend using DBMS_XPLAN.DISPLAY_CURSOR (or Real-Time SQL Monitoring if the corresponding license is available) instead for such purposes, as it offers a completely different level of insight into SQL statement performance.
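For illustration, the recommended approach could look like the following sketch (the gather_plan_statistics hint enables rowsource statistics for this statement; the table t and its filter predicate are taken from the assumed test case above):

```sql
-- Run the statement with rowsource statistics enabled
select /*+ gather_plan_statistics */ count(*) from t where attr = 1;

-- Display the last plan executed by this session, including
-- estimated (E-Rows) and actual (A-Rows) cardinalities side by side
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

Comparing the E-Rows and A-Rows columns in this output directly answers the question AUTOTRACE only hints at: how close the optimizer’s estimates were to reality.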

Output snippet:

Looking at the relevant part of the generated output (here taken from a version) it becomes obvious that:

  • We have a confirmation telling us that Dynamic Sampling results have been used for optimization
  • The dynamic sampling result is reasonable: 108,000 rows estimated vs. 100,000 actual rows, even though we didn’t gather any statistics

Note: For your entertainment you can repeat the experiment with the Dynamic Sampling level set to 0 to see what the result would look like if no Dynamic Sampling was used. In that case the Cost-Based Optimizer reverts to hard-coded defaults that might be way off from the actual statistics.
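Disabling Dynamic Sampling for this experiment could be done in either of the following ways (again assuming the test table t from above):

```sql
-- Disable Dynamic Sampling for the whole session ...
alter session set optimizer_dynamic_sampling = 0;

-- ... or only for a single table in a single statement via hint
select /*+ dynamic_sampling(t 0) */ count(*) from t where attr = 1;
```

The hint variant has the advantage of leaving the rest of the session unaffected, which makes before/after comparisons easier.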

It’s important to understand that one of the most critical components of the cost calculation is the cardinality of the different row sources and joins. In general you can assume: if the cardinality estimates are in the right ballpark, the Cost-Based Optimizer will most of the time come up with a reasonable execution plan. Putting it the other way around: it’s very likely that the resulting execution plan will be suboptimal if the cardinality estimates are not in the right ballpark.

Checking the resulting trace file and searching for “/* OPT_DYN_SAMP */” we can identify the following snippet that shows the Dynamic Sampling details:
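As a sketch of how such a trace file can be produced in the first place (the tracefile identifier is an arbitrary assumption, chosen only to make the file easier to locate in the trace directory):

```sql
-- Make the resulting trace file easy to find
alter session set tracefile_identifier = 'dyn_sampling';

-- Enable the Cost-Based Optimizer trace (event 10053)
alter session set events '10053 trace name context forever, level 1';

-- The statement needs to be hard parsed for the trace to be written
select count(*) from t where attr = 1;

alter session set events '10053 trace name context off';
```

The trace file is written to the session’s trace directory and can then be searched for the "/* OPT_DYN_SAMP */" marker mentioned above.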

I won’t go into too many details in this part of the series but you can see from these lines:

that Dynamic Sampling has been used to determine the size of the object in blocks (something the optimizer would have determined anyway, even without Dynamic Sampling, by the way), the original cardinality of the segment before applying the filter predicate, and the selectivity of the filter predicate, in order to arrive at the final cardinality estimate of 108,074.

Note: This example was explicitly crafted to be Dynamic Sampling “friendly”, because we have a uniform data distribution across the table and the filter predicate applies to many rows (10 percent). We’ll see in one of the upcoming parts of the series what happens if the data distribution is more realistic and potentially less friendly to the optimizer.

I won’t leave you without the following odd result, which can be obtained by slightly modifying the above test case:

Now isn’t that an astonishing result? Dynamic Sampling has been performed at the highest available level, as can be told from the “Note” section (as outlined above, level 10 means that the whole segment is read during sampling; and since the level is mentioned, this also indicates that this time an 11g version was used). The results haven’t been rejected either, otherwise they wouldn’t show up in the note section. Yet the cardinality estimate of 10 shown in the plan is way off in comparison to the actual one (still 100,000 as before; the modified where clause didn’t change the outcome). So although Dynamic Sampling should help the optimizer come up with better estimates, it didn’t quite work out in this particular case.

Note: I don’t show you yet how I modified the test case; I only show you the result, which also uses a slightly modified query. It will become obvious in a later part of the series why the query had to be modified to arrive at these odd Dynamic Sampling results.

Clearly there is something going on that I haven’t explained yet, and more analysis is necessary. As with almost any feature, it doesn’t always work as expected, and therefore it is important to gain a better understanding of how it works and why it might not produce the expected results.

To be continued…

Go to Dynamic Sampling (I), an Introduction – Part 1