{"id":73388,"date":"2011-08-19T09:30:30","date_gmt":"2011-08-19T09:30:30","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/uncategorized\/can-plsql-be-faster-than-sql\/"},"modified":"2021-07-14T13:08:11","modified_gmt":"2021-07-14T13:08:11","slug":"can-plsql-be-faster-than-sql","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/oracle-databases\/can-plsql-be-faster-than-sql\/","title":{"rendered":"Can PL\/SQL be Faster than SQL?"},"content":{"rendered":"<p>PL\/SQL and SQL both are incredibly powerful languages that allow us to do some pretty amazing things with data. If you\u2019ve been around the Oracle community for some time you\u2019ve probably realized, either through your own experience or by reading articles, if you are able to accomplish a task using a single SQL statement then that is preferred over using procedural PL\/SQL code that would run slower. I fully endorse this approach and both use and reiterate it daily to my teams. Here are but a few of the benefits of this brand of thinking:<\/p>\n<ol>\n<li>It forces developers to think in \u201csets\u201d. The Oracle database works incredibly efficiently with sets of data and learning how to harness that power will benefit developers and customers alike.<\/li>\n<li>It promotes an exploration and understanding of the huge wealth of SQL functions available in Oracle.<\/li>\n<li>It allows developers to get creative with their solutions while at the same time writing elegant and simple code that is easily understood and maintained.<\/li>\n<\/ol>\n<p>So, should we always use SQL over PL\/SQL if the SQL language permits us to do so? As you can probably guess, there are situations where using PL\/SQL can be the better option. Such as in life, in technology the only guarantee is that there are no guarantees.<\/p>\n<p>The following is an example of a real-world scenario that I have encountered a few times already in my career and expect to see a few times more before I hang up the keyboard and mouse for a set of golf clubs.<\/p>\n<p>Consider the following two tables:<\/p>\n<pre>SQL&gt; DESC EMPLOYEES\r\n Name                                    Null?    Type\r\n --------------------------------------- -------- ----------------\r\n EMPLOYEE_ID                             NOT NULL NUMBER(22)\r\n EMPLOYEE_SALARY                                  NUMBER(15,2)\r\n\r\nSQL&gt; DESC NEW_SALARY_INFO\r\n Name                                    Null?    Type\r\n --------------------------------------- -------- ----------------\r\n EMPLOYEE_ID                             NOT NULL NUMBER(22)\r\n SALARY                                           NUMBER(15,2)<\/pre>\n<p>Some important facts about these tables:<\/p>\n<ul>\n<li>There is a primary key on the EMPLOYEE_ID column of the EMPLOYEES table<\/li>\n<li>There are no keys or indexes on the NEW_SALARY_INFO table.<\/li>\n<li>The NEW_SALARY_INFO table is in 3rd party vendor system that you cannot change as it would result in the loss of support from the vendor<\/li>\n<li>The EMPLOYEES table and NEW_SALARY_INFO tables are \u201cbig\u201d, having 1,000,000 rows and approximately 4,000,000 rows respectively<\/li>\n<li>Not all EMPLOYEE_ID values in the EMPLOYEES table exist in the NEW_SALARY_INFO table<\/li>\n<\/ul>\n<p>You have been tasked with updating the employee salaries in the EMPLOYEES table with the maximum salary for the same employee, based on EMPLOYEE_ID, in the NEW_SALARY_INFO table. If no record exists in the NEW_SALARY_INFO table for an employee, that employee\u2019s data should remain the same.<\/p>\n<p>At first glance, this seems relatively simple. We\u2019ll just use an UPDATE statement to modify the rows in the EMPLOYEES table:<\/p>\n<pre>UPDATE employees e\r\nSET    employee_salary = (\r\n                          SELECT MAX(salary)\r\n                          FROM new_salary_info nsi\r\n                          WHERE nsi.employee_id = e.employee_id\r\n\r\n                         )\r\nWHERE EXISTS (\r\n              SELECT NULL\r\n              FROM new_salary_info nsi2\r\n              WHERE nsi2.employee_id = e.employee_id\r\n             );<\/pre>\n<p>That should do it. Let\u2019s have a look at the Explain Plan for this statement.<\/p>\n<pre>PLAN_TABLE_OUTPUT\r\n--------------------------------------------------------------------------------------------\r\n| Id | Operation           | Name           | Rows | Bytes |TempSpc| Cost (%CPU)|  Time    |\r\n--------------------------------------------------------------------------------------------\r\n|  0 | UPDATE STATEMENT    |                |  999K|    17M|       |   2772M (2)| 999:59:59|\r\n|  1 |  UPDATE             | EMPLOYEES      |      |       |       |            |          |\r\n|* 2 |   HASH JOIN SEMI    |                |  999K|    17M|    23M|    7887 (1)|  00:01:35|\r\n|  3 |    TABLE ACCESS FULL| EMPLOYEES      | 1000K|    12M|       |     690 (2)|  00:00:09|\r\n|  4 |    TABLE ACCESS FULL| NEW_SALARY_INFO| 3996K|    19M|       |    2764 (2)|  00:00:34|\r\n|  5 |   SORT AGGREGATE    |                |    1 |    13 |       |            |          |\r\n|* 6 |    TABLE ACCESS FULL| NEW_SALARY_INFO|    4 |    52 |       |    2772 (2)|  00:00:34|\r\n--------------------------------------------------------------------------------------------\r\nPredicate Information (identified by operation id):\r\n---------------------------------------------------\r\n\r\n2 - access(\"NSI2\".\"EMPLOYEE_ID\"=\"E\".\"EMPLOYEE_ID\")\r\n6 - filter(\"NSI\".\"EMPLOYEE_ID\"=:B1)<\/pre>\n<p>Hmm, that doesn\u2019t look very promising at all. Look at the estimated cost and time. It makes sense though. There is no index on the NEW_SALARY_INFO table so the database will have to full scan a four million row table for each of the one million employee records to find the maximum salary.<\/p>\n<p>Well, SQL is pretty awesome, so there must be another way to do this. Maybe we can update a view?<\/p>\n<pre>UPDATE\r\n (\r\n   SELECT e.employee_id,\r\n          e.employee_salary,\r\n          NVL((\r\n               SELECT MAX(nsi.salary)\r\n               FROM new_salary_info nsi\r\n               WHERE e.employee_id = nsi.employee_id\r\n              ),\r\n              e.employee_salary) AS new_sal\r\n   FROM employees e\r\n )\r\nSET employee_salary = new_sal;<\/pre>\n<p>Again, let\u2019s have a look at the Explain Plan prior to executing this statement.<\/p>\n<pre>PLAN_TABLE_OUTPUT\r\n-------------------------------------------------------------------------------------\r\n| Id | Operation           | Name            | Rows | Bytes | Cost (%CPU)| Time     |\r\n-------------------------------------------------------------------------------------\r\n|  0 | UPDATE STATEMENT    |                 | 1000K|    12M|   2772M (2)|999:59:59 |\r\n|  1 |  UPDATE | EMPLOYEES |                 |      |       |            |          |\r\n|  2 |   TABLE ACCESS FULL | EMPLOYEES       | 1000K|    12M|     690 (2)| 00:00:09 |\r\n|  3 |   SORT AGGREGATE    |                 |    1 |    13 |            |          |\r\n|* 4 |    TABLE ACCESS FULL| NEW_SALARY_INFO |    4 |    52 |    2772 (2)| 00:00:34 |\r\n-------------------------------------------------------------------------------------\r\n\r\nPredicate Information (identified by operation id):\r\n---------------------------------------------------\r\n\r\n    4 - filter(\"NSI\".\"EMPLOYEE_ID\"=:B1)<\/pre>\n<p>Once again we\u2019ve got a very high estimated cost and time, for the same reason as the previous update. Surely we are not out of SQL options yet? No, we\u2019re not. Let\u2019s try a MERGE statement.<\/p>\n<pre>MERGE INTO employees e USING\r\n  (\r\n    SELECT   e.rowid AS e_rowid,\r\n             MAX(nsi.salary) AS new_sal\r\n    FROM     employees e,\r\n             new_salary_info nsi\r\n    WHERE    e.employee_id = nsi.employee_id\r\n    GROUP BY e.rowid\r\n  ) new_salaries\r\nON (e.rowid = new_salaries.e_rowid)\r\nWHEN MATCHED THEN UPDATE\r\n SET e.employee_salary = new_salaries.new_sal;<\/pre>\n<p>But when I try to view the Explain Plan for the MERGE, I receive the following error<\/p>\n<pre>ORA-28132: The MERGE INTO syntax does not support the security policy.<\/pre>\n<p>Curses! Foiled again! It turns out there is a VPD security policy on the EMPLOYEES table that prevents us from using the MERGE. So what now? All of our attempts at using SQL have been stymied.<\/p>\n<p>PL\/SQL to the rescue! By making use of bulk processing in PL\/SQL we can perform our task rather quickly and with minimal code. The Oracle bulk collect tells the SQL engine to fetch multiple rows at once and store them in a collection defined by the developer. With the addition of the use of the FORALL statement, this greatly reduces the amount of context switching between PL\/SQL and SQL by sending DML statements from PL\/SQL to SQL in batches rather than one at a time, resulting in dramatic performance improvements over row-by-row processing.<\/p>\n<p>The first thing we\u2019ll need to do is write a SQL query that returns all of the data we need to update every row in the EMPLOYEES table that has a matching record(s) in the the NEW_SALARY_INFO table, keeping in mind that we need the maximum salary. This is rather straightforward.<\/p>\n<pre>SELECT   e.employee_id employee_id,\r\n         MAX(nsi.salary) new_salary\r\nFROM     employees e,\r\n         new_salary_info nsi\r\nWHERE    e.employee_id = nsi.employee_id\r\nGROUP BY e.employee_id;<\/pre>\n<p>That\u2019s it. That gives us exactly what we are looking for. Now we can turn that query into a PL\/SQL cursor, process the results in bulk, and perform the required update using a FORALL statement. Incidentally, the above query returns 999,000 rows.<\/p>\n<pre>SQL&gt; DECLARE\r\n2      CURSOR   c_new_salary_info IS\r\n3      SELECT   e.employee_id employee_id,\r\n4               MAX(nsi.salary) new_salary\r\n5      FROM     employees e,\r\n6               new_salary_info nsi\r\n7      WHERE    e.employee_id = nsi.employee_id\r\n8      GROUP BY e.employee_id;\r\n9\r\n10     TYPE t_new_salary_info IS TABLE OF c_new_salary_info%rowtype\r\n11      INDEX BY PLS_INTEGER;\r\n12     lt_new_salary_info t_new_salary_info;\r\n13\r\n14     v_rowcount PLS_INTEGER := 0;\r\n15   BEGIN\r\n16     OPEN c_new_salary_info;\r\n17\r\n18     LOOP\r\n19       FETCH c_new_salary_info BULK COLLECT\r\n20       INTO lt_new_salary_info LIMIT 1000;\r\n21\r\n22       EXIT WHEN lt_new_salary_info.COUNT = 0;\r\n23\r\n24       FORALL i IN 1..lt_new_salary_info.COUNT\r\n25         UPDATE employees\r\n26         SET employee_salary = lt_new_salary_info(i).new_salary\r\n27         WHERE employee_id   = lt_new_salary_info(i).employee_id;\r\n28\r\n29       v_rowcount := v_rowcount + SQL%ROWCOUNT;\r\n30      END LOOP;\r\n31\r\n32      CLOSE c_new_salary_info;\r\n33\r\n34      DBMS_OUTPUT.PUT_LINE(v_rowcount||' rows updated.');\r\n35 END;\r\n36 \/\r\n999000 rows updated.\r\n\r\nPL\/SQL procedure successfully completed.\r\n\r\nElapsed: 00:00:29.91\r\nSQL&gt;<\/pre>\n<p>In approximately 30 seconds we\u2019ve updated nearly one million rows with data from a four million row table. I\u2019d say that\u2019s pretty good, if you asked me.<\/p>\n<p>So why does this work so well? The first reason has already been noted, and that is the use of bulk operations and FORALL to reduce context switching between PL\/SQL and SQL. The second is how the cursor query is executed. The optimizer no longer has to worry about performing an update as part of the query and is able to simply join the two tables in a very efficient manner. We\u2019ve gone from a series of SQL update statements that would either have taken hours to complete or produced an Oracle error, to a small PL\/SQL block that does exactly we need in a very reasonable amount of time.<\/p>\n<p>To satisfy our curiosity however, let\u2019s see how the PL\/SQL block stacks up against the UPDATE and MERGE statements if we were allowed to add indexes to the NEW_SALARY_INFO table and if the EMPLOYEES table did not have a security policy against it.<\/p>\n<pre>SQL&gt; CREATE INDEX test_idx ON new_salary_info (employee_id);\r\n\r\nIndex created.\r\n\r\nElapsed: 00:00:06.04\r\nSQL&gt; BEGIN\r\n  2   DBMS_RLS.DROP_POLICY\r\n  3    (\r\n  4      object_name =&gt; 'EMPLOYEES',\r\n  5      policy_name =&gt; 'EMP_VIEW_POLICY'\r\n  6    );\r\n  7   END;\r\n  8   \/\r\n\r\nPL\/SQL procedure successfully completed.\r\n\r\nElapsed: 00:00:00.01\r\nSQL&gt;<\/pre>\n<p>Now let\u2019s look at the respective plans of our previous UPDATE and MERGE statements.<\/p>\n<pre>UPDATE employees e\r\nSET    employee_salary = (\r\n                          SELECT MAX(salary)\r\n                          FROM   new_salary_info nsi\r\n                          WHERE  nsi.employee_id = e.employee_id\r\n                         )\r\nWHERE EXISTS (\r\n              SELECT NULL\r\n              FROM   new_salary_info nsi2\r\n              WHERE  nsi2.employee_id = e.employee_id\r\n             );\r\n\r\nPLAN_TABLE_OUTPUT\r\n--------------------------------------------------------------------------------------------------------\r\n| Id  | Operation                     | Name            | Rows | Bytes |TempSpc| Cost (%CPU)| Time     |\r\n--------------------------------------------------------------------------------------------------------\r\n|   0 | UPDATE STATEMENT              |                 |  999K|    17M|       |  8007K (13)| 26:41:31 |\r\n|   1 |  UPDATE                       | EMPLOYEES       |      |       |       |            |          |\r\n|*  2 |   HASH JOIN SEMI              |                 |  999K|    17M|    23M|    7557 (1)| 00:01:31 |\r\n|   3 |    TABLE ACCESS FULL          | EMPLOYEES       | 1000K|    12M|       |     690 (2)| 00:00:09 |\r\n|   4 |    INDEX FAST FULL SCAN       | TEST_IDX        | 3996K|    19M|       |    2433 (1)| 00:00:30 |\r\n|   5 |   SORT AGGREGATE              |                 |    1 |    13 |       |            |          |\r\n|   6 |    TABLE ACCESS BY INDEX ROWID| NEW_SALARY_INFO |    4 |    52 |       |       7 (0)| 00:00:01 |\r\n|*  7 |     INDEX RANGE SCAN          | TEST_IDX        |    4 |       |       |       3 (0)| 00:00:01 |\r\n--------------------------------------------------------------------------------------------------------\r\n\r\nPredicate Information (identified by operation id):\r\n---------------------------------------------------\r\n\r\n   2 - access(\"NSI2\".\"EMPLOYEE_ID\"=\"E\".\"EMPLOYEE_ID\")\r\n   7 - access(\"NSI\".\"EMPLOYEE_ID\"=:B1)\r\n\r\nUPDATE\r\n (\r\n   SELECT e.employee_id,\r\n          e.employee_salary,\r\n          NVL((SELECT MAX(nsi.salary) FROM new_salary_info nsi WHERE e.employee_id =\r\nnsi.employee_id), e.employee_salary) AS new_sal\r\n   FROM employees e\r\n )\r\nSET employee_salary = new_sal;\r\n\r\nPLAN_TABLE_OUTPUT\r\n-----------------------------------------------------------------------------------------------\r\n| Id  | Operation                    | Name            | Rows | Bytes | Cost (%CPU)| Time     |\r\n-----------------------------------------------------------------------------------------------\r\n|   0 | UPDATE STATEMENT             |                 | 1000K|    12M|  8000K (13)| 26:40:09 |\r\n|   1 |  UPDATE                      | EMPLOYEES       |      |       |            |          |\r\n|   2 |   TABLE ACCESS FULL          | EMPLOYEES       | 1000K|    12M|     690 (2)| 00:00:09 |\r\n|   3 |   SORT AGGREGATE             |                 |    1 |    13 |            |          |\r\n|   4 |   TABLE ACCESS BY INDEX ROWID| NEW_SALARY_INFO |    4 |    52 |       7 (0)| 00:00:01 |\r\n|*  5 |    INDEX RANGE SCAN          | TEST_IDX        |    4 |       |       3 (0)| 00:00:01 |\r\n-----------------------------------------------------------------------------------------------\r\n\r\nPredicate Information (identified by operation id):\r\n---------------------------------------------------\r\n\r\n\u00a0  5 - access(\"NSI\".\"EMPLOYEE_ID\"=:B1)\r\n\r\nMERGE INTO employees e USING\r\n (\r\n   SELECT   e.rowid AS e_rowid,\r\n            MAX(nsi.salary) AS new_sal\r\n   FROM     employees e,\r\n            new_salary_info nsi\r\n   WHERE    e.employee_id = nsi.employee_id\r\n   GROUP BY e.rowid\r\n ) new_salaries\r\nON (e.rowid = new_salaries.e_rowid)\r\nWHEN MATCHED THEN UPDATE\r\n SET e.employee_salary = new_salaries.new_sal;\r\n\r\nPLAN_TABLE_OUTPUT\r\n------------------------------------------------------------------------------------------------------\r\n| Id  | Operation                   | Name            | Rows | Bytes |TempSpc| Cost (%CPU)|     Time |\r\n------------------------------------------------------------------------------------------------------\r\n|   0 | MERGE STATEMENT             |                 | 3991K|    79M|       |   14844 (2)| 00:02:59 |\r\n|   1 |  MERGE                      | EMPLOYEES       |      |       |       |            |          |\r\n|   2 |   VIEW                      |                 |      |       |       |            |          |\r\n|   3 |    MERGE JOIN               |                 | 3991K|   144M|       |   14844 (2)| 00:02:59 |\r\n|   4 |     SORT JOIN               |                 | 3991K|    95M|       |    9392 (3)| 00:01:53 |\r\n|   5 |      VIEW                   |                 | 3991K|    95M|       |    9392 (3)| 00:01:53 |\r\n|   6 |       SORT GROUP BY         |                 | 3991K|    98M|       |    9392 (3)| 00:01:53 |\r\n|*  7 |        HASH JOIN            |                 | 3991K|    98M|    23M|    9228 (1)| 00:01:51 |\r\n|   8 |         INDEX FAST FULL SCAN| SYS_C008302     | 1000K|    12M|       |     516 (2)| 00:00:07 |\r\n|   9 |         TABLE ACCESS FULL   | NEW_SALARY_INFO | 3996K|    49M|       |    2767 (2)| 00:00:34 |\r\n|* 10 |      SORT JOIN              |                 | 1000K|    12M|    45M|    5452 (1)| 00:01:06 |\r\n|  11 |       TABLE ACCESS FULL     | EMPLOYEES       | 1000K|    12M|       |     690 (2)| 00:00:09 |\r\n------------------------------------------------------------------------------------------------------\r\n\r\nPredicate Information (identified by operation id):\r\n---------------------------------------------------\r\n\r\n   7 - access(\"E\".\"EMPLOYEE_ID\"=\"NSI\".\"EMPLOYEE_ID\")\r\n  10 - access(\"E\".ROWID=\"NEW_SALARIES\".\"E_ROWID\")\r\n       filter(\"E\".ROWID=\"NEW_SALARIES\".\"E_ROWID\")<\/pre>\n<p>Those look much better. The optimizer is making use of the new index when it should, and the MERGE statement is now permissible. The following table summarizes the average of 10 executions of each of the UPDATE statements, the MERGE statement, and the PL\/SQL block. The results are rather interesting:<\/p>\n<table style=\"width: 563px; height: 130px; border color: #000000;\" border=\"1\">\n<tbody>\n<tr>\n<td width=\"100\"><\/td>\n<td>Update<\/td>\n<td>Update View<\/td>\n<td>Merge<\/td>\n<td>PL\/SQL<\/td>\n<\/tr>\n<tr>\n<td>Average time (seconds) over 10 executions<\/td>\n<td>56.753<\/td>\n<td>40.924<\/td>\n<td>27.013<\/td>\n<td>31.108<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The PL\/SQL block out-performed both of the updates and was nearly on par with the MERGE statement, even in an ideal world.<\/p>\n<p>What does this teach us? I think it is a reminder that we should be both aware of and willing to explore all of our options as developers. PL\/SQL has the ability to work very efficiently for you if you understand what is available within the language. It could be your best option.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>PL\/SQL and SQL both are incredibly powerful languages that allow us to do some pretty amazing things with data. If you\u2019ve been around the Oracle community for some time you\u2019ve probably realized, either through your own experience or by reading articles, if you are able to accomplish a task using a single SQL statement then that is preferred over using&hellip;<\/p>\n","protected":false},"author":316175,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143533],"tags":[],"coauthors":[],"class_list":["post-73388","post","type-post","status-publish","format-standard","hentry","category-oracle-databases"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/73388","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/316175"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=73388"}],"version-history":[{"count":1,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/73388\/revisions"}],"predecessor-version":[{"id":91818,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/73388\/revisions\/91818"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=73388"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=73388"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=73388"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=73388"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}