Understanding SQL Query Parsing: Part 4 – Understanding Bind Variable Peeking

In the 3rd part of this series, I introduced you to the concept of the bind variables. We also learned their importance in regard to cursors, i.e. how with the presence of bind variables, cursor reparsing can be reduced significantly. But as is famously said in the world of performance tuning, there is nothing black or white: it’s all grey…. Continue Reading →

In the 3rd part of this series, I introduced you to the concept of the bind variables. We also learned their importance in regard to cursors, i.e. how with the presence of bind variables, cursor reparsing can be reduced significantly. But as is famously said in the world of performance tuning, there is nothing black or white: it’s all grey.

Similarly, it’s not all straightforward with the usage of bind variables. In this instalment of this article series, we will understand the potential issues surrounding bind variables and how to handle them.

Literals – when they are safe to be used

As discussed in the previous article, using bind variables is good when we are expecting the cursors not to require reparsing. It means that the cursors, with different bind variable values, would also need to share their execution plans. For example, what would happen if we select two records from the EMP table in the example Scott schema, selecting with the EMPNO? Since EMPNO is a Primary Key column, it is indexed and unique. This means that there is no chance of having one employee code occurring more than once. In this scenario, use of bind variables will definitely be the right thing to do. And because of the bind variables, we will end up with just one cursor being created in the memory.

Let’s see what’s really happening in the database. First we execute two different SQL statements in the Scott session and then we’ll check how many cursors are created by the database for those two different employee codes.

First, executing the queries using bind variable in the Scott schema.

SQL> show user
USER is "SCOTT"
SQL> show parameter sharing

NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
cursor_sharing			     string	 EXACT
SQL> variable eno number
SQL> exec :eno:=7369;

PL/SQL procedure successfully completed.

SQL> select * from emp where empno=:eno;

     EMPNO ENAME      JOB	       MGR HIREDATE	    SAL       COMM     DEPTNO

---------- ---------- --------- ---------- --------- ---------- ----------- ---------
      7369 SMITH      CLERK	      7902 17-DEC-80	    800	                   20

SQL> exec :eno:=7499

PL/SQL procedure successfully completed.

SQL> select * from emp where empno=:eno;

     EMPNO ENAME      JOB	       MGR HIREDATE	    SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7499 ALLEN      SALESMAN	      7698 20-FEB-81	   1600        300         30

Now, in a different session using the Sys user, let’s check the child cursor count:

SQL> show user    
USER is "SYS"
SQL> col child_number Heading 'CHILD|NUMBER'
SQL> col object_name format a6
SQL> col operation format a16
SQL> col options format a15
SQL> select address,hash_value,child_number,
operation,options,object_name
from v$sql_plan
where (address,hash_value) in
(select address,hash_value
from v$sql
where sql_text like '%select * from emp where%');  2    3    4    5    6    7  
				 CHILD
ADDRESS 	 HASH_VALUE	NUMBER OPERATION	OPTIONS 	OBJECT
---------------- ---------- ---------- ---------------- --------------- ------
000000006FF14E88 2367053566	     0 SELECT STATEMENT
000000006FF14E88 2367053566	     0 TABLE ACCESS	BY INDEX ROWID	EMP
000000006FF14E88 2367053566	     0 INDEX		UNIQUE SCAN	PK_EMP

We can see that for both the statements just one cursor is created and the plan, as expected, is an index plan because a repeat of the same employee number is just not possible.

But what would happen if the data distribution didn’t favour the reuse of the same execution plan for multiple bind variable values? What if for one value, the database would be better off using a distinct plan rather than one originally used for another value? Incidentally, this is the point where the use of bind variables becomes less useful – a phenomenon known as Bind Variable Peeking!

Bind Variables Peeking – When bind variables strike!

As mentioned earlier, use of bind variables is good if reparsing needs to be avoided. But if we change the data distribution, demanding distinct execution plans for different bind variable values, the benefit gained from the absence of reparsing starts impacting performance in an adverse manner.

Let’s explore this by creating a data set where the occurrence of one value is going to be very high and another is much lower, thus ensuring different plans must be used for both values. The column is going to be indexed. Since indexes are generally preferred by the database optimizer when the occurrence of the searched predicate value is much lower (this is known as Cardinality), and utilises the access path of Full Table Scan if the cardinality is very high, we shall use this as the basis of our demonstration. With a particular value (1), we will force the database to pick up the Index access path and with another value (99), we’ll see that the database uses the Full Table Scan to execute the query. So, let’s start by creating the table, creating an index on the ID column, and then gathering the statistics on the table with histograms.

SQL>  create table new_tab
    as
    select decode(rownum,1,1,99) ID,
           all_objects.*
      from all_objects
    /  2    3    4    5    6  

Table created.

SQL> create index newt_idx on new_tab (id);

Index created.

SQL> begin
           dbms_stats.gather_table_stats
            ( ownname    => USER,
              tabname     => 'NEW_TAB',
              method_opt => 'for all indexed columns size 254',
              cascade      => TRUE
            );
    end;  2    3    4    5    6    7    8  
  9  /

PL/SQL procedure successfully completed.

Let’s check the data distribution in the column ID.

SQL> select id, count(id) from new_tab group by id;

	ID  COUNT(ID)
---------- ----------
	 1	    1
	99	84775

Now it’s certain that if we are going to select value 1, the database must choose the Index access path since it’s occurrence is much lower (just 1 occurrence in 84k rows). Let’s confirm our assumptions by executing the queries, but without using the bind variable. The reason to do so is that we want to enforce hard parsing for each execution of the query by passing the literal constants. Later, we will compare these executions along with the usage of the bind variables.

SQL> set autot trace exp
SQL> select * from new_tab where id=1;

Execution Plan
----------------------------------------------------------
Plan hash value: 470836197

--------------------------------------------------------------------------------
-----

| Id  | Operation		    | Name  | Rows  | Bytes | Cost (%CPU)| Time
    |

--------------------------------------------------------------------------------
-----

|   0 | SELECT STATEMENT	    |	    |	 15 |  1515 |	  2   (0)| 00:00
:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID| NEW_TAB     |	 15 |  1515 |	  2   (0)| 00:00
:01 |

|*  2 |   INDEX RANGE SCAN	    | NEWT_IDX |	 15 |	    |	  1   (0)| 00:00
:01 |

--------------------------------------------------------------------------------
-----


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"=1)

SQL> select * from new_tab where id=99;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation	  | Name | Rows  | Bytes | Cost (%CPU)| Time	 |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	 | 84753 |  8359K|   348   (1)| 00:00:05 |
|*  1 | TABLE ACCESS FULL |NEW_TAB| 84753 |  8359K|   348   (1)| 00:00:05 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ID"=99)

So as expected with literals being used, the database is choosing the correct execution plan. But what happens when we bring bind variables into the mix? Let’s check now by using the 10053 trace event and formatting the raw trace file with TKPROF.

Here is an extract from the formatted trace file, leaving out the sections that are not needed in this discussion.

select * 
from
 new_tab where id=:d


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0          4          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.00          0          4          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 90  
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  TABLE ACCESS BY INDEX ROWID NEW_TAB (cr=4 pr=0 pw=0 time=24 us cost=2 size=1515 card=15)
         1          1          1   INDEX RANGE SCAN NEWT_IDX (cr=3 pr=0 pw=0 time=17 us cost=1 size=0 card=15)(object id 88745)

SQL ID: 5p08mkb7b3m0t Plan Hash: 470836197

select * 
from
 new_tab where id=:d


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       35      0.00       0.00          0         77          0         511
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       37      0.00       0.00          0         77          0         511

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 90  
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
       511        511        511  TABLE ACCESS BY INDEX ROWID NEW_TAB (cr=77 pr=0 pw=0 time=4103 us cost=2 size=1515 card=15)
       511        511        511  INDEX RANGE SCAN NEWT_IDX  (cr=36 pr=0 pw=0 time=14168 us cost=1 size=0 card=15)(object id 88745)

We can clearly see in the bold sections of the output that the query was hard-parsed upon first execution, (this is reflected in the Misses in the Library Cache shown with value 1) and that the correct plan was chosen. But for the 2nd execution, the query selected the index plan despite it not being needed. This is due to soft parsing happening and the cursor not getting reparsed (the number of misses in the library cache is 0). Clearly this is a case where bind variables did much less good.

Now, you may definitely want to know if we can stop this behaviour from happening. Unfortunately no is be the answer, since this is a default feature and is controlled by an internal parameter. Fortunately, from Oracle 11g onwards we do have a feature that can take care of this behaviour – Adaptive Cursor Sharing.

Conclusion

It’s always good to check all aspects of a feature before considering them, because no solution is one-size-fits-all. In the next article, we will explore the Adaptive Cursor Sharing feature and learn how it addresses Bind Variable Peeking.