{"id":80543,"date":"2018-08-30T13:27:36","date_gmt":"2018-08-30T13:27:36","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=80543"},"modified":"2022-04-24T15:36:49","modified_gmt":"2022-04-24T15:36:49","slug":"10-best-practices-for-writing-oracle-sql","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/oracle-databases\/10-best-practices-for-writing-oracle-sql\/","title":{"rendered":"10 Best Practices for Writing Oracle SQL"},"content":{"rendered":"<p>Writing efficient and high-quality SQL is hard to do. Sometimes it comes down to trialling different types of queries to get one that gives you the results you want and has good performance. There are a range of \u2018best practices\u2019 or tips that are recommended for working with SQL. Many of them relate to SQL overall, and some of them are specific to Oracle SQL. In this article, I\u2019ll explain ten of the Oracle SQL best practices to help improve your SQL queries.<\/p>\n<h2>1. Use ANSI Joins Instead of Oracle Joins<\/h2>\n<p>In Oracle SQL, there are two ways to join tables. You might be familiar with the ANSI method, which involves using <strong>JOIN<\/strong> keywords between tables:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT emp.*, dept.*\r\nFROM emp\r\nINNER JOIN dept ON emp.dept_id = dept.id;<\/pre>\n<p>You can also do outer joins such as a LEFT JOIN:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT emp.*, dept.*\r\nFROM emp\r\nLEFT JOIN dept ON emp.dept_id = dept.id;<\/pre>\n<p>There is another method which is occasionally referred to as an Oracle join, because the syntax is Oracle-specific. An inner join is done using the <strong>WHERE<\/strong> clause:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT emp.*, dept.*\r\nFROM emp, dept\r\nWHERE emp.dept_id = dept.id;<\/pre>\n<p>An outer join is done by placing a <strong>(+)<\/strong> on the <strong>WHERE<\/strong> clause after the column that <strong>NULL<\/strong>s are allowed. For example, a <strong>LEFT JOIN<\/strong> can be written as:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT emp.*, dept.*\r\nFROM emp, dept\r\nWHERE emp.dept_id = dept.id(+);<\/pre>\n<p>A <strong>RIGHT JOIN<\/strong> can be written by putting the symbol on the other side of the join:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT emp.*, dept.*\r\nFROM emp, dept\r\nWHERE emp.dept_id(+) = dept.id;<\/pre>\n<p>The recommendation with writing joins is to use the ANSI style (the <strong>JOIN <\/strong>and <strong>ON<\/strong> keywords) rather than the Oracle style (the <strong>WHERE<\/strong> clause with <strong>(+)<\/strong> symbols). I\u2019ve written about this before in my guide to <a href=\"https:\/\/www.databasestar.com\/sql-joins\/\">joins in Oracle<\/a>, and there are a few reasons for this:<\/p>\n<ul>\n<li>In large queries, it\u2019s easy to forget to add a <strong>WHERE<\/strong> clause to join a table, causing unnecessary cartesian joins and incorrect results<\/li>\n<li>The <strong>WHERE<\/strong> clause should be used for filtering records, not for joining tables together. A subtle difference, but it makes the query easier to understand<\/li>\n<li>ANSI joins are arguably easier to read, as you can see which section is used for joins and which is used for filtering data.<\/li>\n<\/ul>\n<h2>2. Avoid WHERE Clauses with Functions<\/h2>\n<p>Another recommendation for working with Oracle SQL is to avoid writing WHERE clauses that use functions. In SQL, <strong>WHERE<\/strong> clauses are used to filter the rows to be displayed. These are often used to check that a column equals a certain value:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">WHERE status = \u2018A\u2019<\/pre>\n<p>You may have a need to compare a column to a value that has used a function. For example:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">WHERE UPPER(last_name) = \u2018SMITH\u2019<\/pre>\n<p>Another example could be:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">WHERE ROUND(monthly_salary) &gt; 2000<\/pre>\n<p>Using functions on columns in the <strong>WHERE<\/strong> clause should be avoided. This is because any indexes that are created on the columns themselves (e.g. <em>last_name <\/em>or <em>monthly_salary<\/em>) will not be used if a function is applied in the query, which can slow the query down a lot.<\/p>\n<p>To avoid using a function on a column, consider if there\u2019s a way to write the <strong>WHERE<\/strong> clause without the function. Sometimes there is, but other times you need to write the function.<\/p>\n<p>If you do need to have the function on the column in the <strong>WHERE<\/strong> clause, consider creating a function-based index on the column. This is a type of index that is created on the result of a function applied to the column, which could be used in this query.<\/p>\n<h2>3. Use CASE Instead of Multiple Unions<\/h2>\n<p>I\u2019ve seen several examples of queries that are looking up a range of records based on criteria. The criteria are more than just a simple <strong>WHERE<\/strong> clause, and depending on different types of records, the joins and other criteria might be different.<\/p>\n<p>This is often implemented as several <strong>SELECT<\/strong> queries joined together using <strong>UNION<\/strong> or <strong>UNION ALL<\/strong> keywords. For example:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT id, product_name\r\nFROM product\r\nWHERE status = \u2018X\u2019 AND created_date &lt; TO_DATE(\u20182017-01-01\u2019, \u2018YYYY-MM-DD\u2019)\r\nUNION ALL\r\nSELECT id, product_name\r\nFROM product\r\nWHERE status = \u2018A\u2019 AND product_series = \u2018WXT\u2019;<\/pre>\n<p>This is a simple example, but often the different queries may include joins or lookups to other tables.<\/p>\n<p>Structuring a query like this means that the tables need to be queried several times (once for each <strong>SELECT<\/strong> query), which is quite inefficient. There is a chance that your table will have an index on it to make it run more efficiently, but there is another method that\u2019s worth trying. Rather than having separate queries with <strong>UNION ALL<\/strong>, try putting the logic inside a <strong>CASE <\/strong>statement inside a single <strong>SELECT<\/strong>:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk \">SELECT id, product_name\r\nFROM (\r\nSELECT id, product_name,\r\nCASE\r\nWHEN status = \u2018X\u2019 AND created_date &lt; \r\n         TO_DATE(\u20182017-01-01\u2019, \u2018YYYY-MM-DD\u2019) THEN 1\r\nWHEN status = \u2018A\u2019 AND product_series = \u2018WXT\u2019 THEN 1\r\nELSE 0 END AS prodcheck\r\nFROM product\r\n) sub\r\nWHERE prodcheck = 1;<\/pre>\n<p>This query would only run once on the <em>product<\/em> table and will show the same results as separate <em>SELECT<\/em> queries with a <em>UNION ALL<\/em>.<\/p>\n<p>The logic to show the right records is in the <em>CASE<\/em> statement. There are several lines, one for each set of criteria, and it returns a 1 if a match is found. This logic is all inside a subquery, and the outer query filters to show only those records where that <strong>CASE<\/strong> is 1.<\/p>\n<p>There are a few different ways to write the <strong>CASE<\/strong> statement, but the idea is to only have the main query and several criteria in the <strong>CASE<\/strong> statement, rather than separate queries. However, make sure you test both versions of the query for performance, as there may be indexes that are used with the <strong>UNION<\/strong> query that don\u2019t run with the <strong>CASE<\/strong> query.<\/p>\n<h2>4. Minimise the Use of DISTINCT<\/h2>\n<p>The <strong>DISTINCT<\/strong> keyword in SQL allows you to return unique records in the result set by eliminating duplicate results. This seems simple, and it\u2019s a useful command. Using <strong>DISTINCT<\/strong> is OK in many cases, however, it can be a symptom of a different issue. If your result set is displaying data from many different tables, you might end up getting some duplicate results. I\u2019ve seen this many times in my queries.<\/p>\n<p>It can be tempting to add a <strong>DISTINCT<\/strong> keyword to ensure you don\u2019t get duplicate records. But adding a <strong>DISTINCT<\/strong> keyword will likely cause an expensive operation to be performed on your query, slowing it down. It will give you the results you need, but it\u2019s masking a problem elsewhere. It could be from an incomplete <strong>JOIN<\/strong>, or incorrect data in a table, or some criteria you aren\u2019t considering, which is causing the duplicate row. Fixing the issue in your query or in your data is the right solution.<\/p>\n<h2>5. Redesign Data Value Lists to Use Tables<\/h2>\n<p>Occasionally you may need to write queries that use several values as criteria. This is often done as a <strong>WHERE<\/strong> clause and an <strong>IN<\/strong> keyword:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT *\r\nFROM product\r\nWHERE status IN (\u2018A\u2019, \u2018P\u2019, \u2018C\u2019, \u2018S\u2019);<\/pre>\n<p>This query might give you the results you want. What would happen if the status values change at some point in the future, or the business rules change which means you need to adjust this list.<\/p>\n<p>If this list is coded into your query, you\u2019ll need to adjust your query. This may result in change in application code and a deployment process.<\/p>\n<p>Another way to do this is to store the values in a separate table and join to this table. For example, you could have a <em>status_lookup<\/em> table which has values and categories in it, where the category defines the data you need.<\/p>\n<p>Your query could then be something like this:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT product.*\r\nFROM product\r\nINNER JOIN status_lookup ON product.status = status_lookup.status\r\nWHERE status_lookup.category = \u2018ACTIVE\u2019;<\/pre>\n<p>This way, whenever the business rules change, all you need to do is update the data in your <em>status_lookup<\/em> table, and no code changes are required. This recommendation was also suggested in the article on <a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/t-sql-programming\/sql-code-smells\/\">SQL Code Smells<\/a>.<\/p>\n<h2>6. UNION ALL instead of UNION<\/h2>\n<p>There are two similar keywords in SQL that are used to combine results: <strong>UNION<\/strong> and <strong>UNION ALL<\/strong>. They are called \u2018set operators\u2019, as they work with result sets.<\/p>\n<p>There are some minor differences between them. <strong>UNION ALL<\/strong> shows all records in both result sets, and <strong>UNION<\/strong> shows all records excluding duplicates.<\/p>\n<p>Just to be clear, <strong>UNION<\/strong> removes duplicates and <strong>UNION ALL<\/strong> does not.<\/p>\n<p>This means, in Oracle, that an extra step is performed when using a <strong>UNION <\/strong>to remove all duplicate rows from the result set after it is combined. It\u2019s the same as performing a <strong>DISTINCT<\/strong>.<\/p>\n<p>If you really need duplicates removed, then use <strong>UNION<\/strong>. But, if you only want to combine values and don\u2019t care about duplicates, or want to see all values, then use <strong>UNION ALL<\/strong>. Depending on your query, it will give you the same results and also perform better as there is no duplicate removal.<\/p>\n<h2>7. Use Table Aliases<\/h2>\n<p>A great way to improve your queries is to use table aliases. Table aliases are names you can give to tables in your queries, to make them easier to write and work with. For example, using our earlier query on product and <em>status_lookup<\/em> tables, this is what it looks like without a table alias:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT product.*\r\nFROM product\r\nINNER JOIN status_lookup ON product.status = status_lookup.status\r\nWHERE status_lookup.category = \u2018ACTIVE\u2019;<\/pre>\n<p>You can add table aliases by specifying a name after the table name. These table aliases are usually short (one or a few characters), and are usually an abbreviation for the full table name:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT p.*\r\nFROM product p\r\nINNER JOIN status_lookup s ON p.status = s.status\r\nWHERE s.category = \u2018ACTIVE\u2019;<\/pre>\n<p>The table alias of <strong>p<\/strong> for <em>product<\/em> and <strong>s<\/strong> for <em>status_lookup<\/em> are included with the tables in the query. Then, anytime you refer to those tables (in the <strong>SELECT<\/strong> clause, the <strong>JOIN<\/strong>, or the <strong>WHERE<\/strong> clause), you can use the table alias. It makes it easier to read and write.<\/p>\n<p>Also, <strong>p<\/strong> and <strong>s<\/strong> were deliberately chosen as they are abbreviations for the full table name. This is a good practice to use, especially when working on larger queries, rather than using generic letters such as <strong>a<\/strong> or <strong>b<\/strong>. It\u2019s much easier to tell which table a field comes from if you use a descriptive alias.<\/p>\n<h2>8. Only Use HAVING on Aggregate Functions<\/h2>\n<p>The <strong>HAVING<\/strong> clause in Oracle SQL is used to filter records from your result set. It\u2019s very similar to the <strong>WHERE<\/strong> clause. However, the <strong>WHERE<\/strong> clause filters rows before the aggregate functions are applied, and the <strong>HAVING<\/strong> clause filters rows after the aggregate functions are applied. It can be tempting to use <strong>HAVING<\/strong> for everything if you\u2019re using an aggregate function, but they do different things in your query.<\/p>\n<p>For example:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT status, COUNT(*)\r\nFROM product\r\nWHERE status IS NOT NULL\r\nGROUP BY status\r\nHAVING COUNT(*) &gt; 1;<\/pre>\n<p>This will find the count of each <em>product status<\/em> that is not <strong>NULL<\/strong> where there is more than one record for the <em>status<\/em>, which is likely what you want. If you write the query using only the <strong>HAVING<\/strong> clause, it would look like this:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT status, COUNT(*)\r\nFROM product\r\nGROUP BY status\r\nHAVING status IS NOT NULL\r\nAND COUNT(*) &gt; 1;<\/pre>\n<p>This may give you different results, depending on your data. It may also perform worse, as it needs to aggregate all of the data before removing it using the <strong>HAVING<\/strong> clause. It also implies a different set of rules.<\/p>\n<p>Be sure to only use <strong>HAVING<\/strong> on aggregate functions and use <strong>WHERE<\/strong> on results you want to restrict before the aggregate.<\/p>\n<h2>9. Always Specify Columns in INSERT Statements<\/h2>\n<p>The <strong>INSERT <\/strong>statement in Oracle SQL has an optional component where you specify the columns to insert data into:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">INSERT INTO tablename (col1, col2\u2026 col_n)\r\nVALUES (val1, val2\u2026 val_n);<\/pre>\n<p>The part of the <strong>INSERT<\/strong> statement with the columns is the optional part. An <strong>INSERT<\/strong> statement without the columns will still work:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">INSERT INTO product VALUES (1, \u2018Large Chair\u2019, 120.00);<\/pre>\n<p>However, a good habit to get into is to specify the columns in an <strong>INSERT<\/strong> statement. This has several benefits. First, it can prevent errors or data going into the wrong column. Without specifying the columns, there\u2019s no guarantee which order the columns will be inserted into. This can cause errors to appear, or the data to be inserted with values in the wrong columns.<\/p>\n<p>It\u2019s also clear which columns represent which values. When you look at a statement without the columns, you\u2019ll have to guess what the values are. If you add the columns, you know exactly which values are for each column.<\/p>\n<p>Be sure to include the columns in your <strong>INSERT<\/strong> statement.<\/p>\n<h2>10. Avoid Object Names with Spaces<\/h2>\n<p>The final best practice I\u2019ll recommend is to avoid using spaces in your object names. Many examples of SQL online specify object names (such as tables) that include spaces. Most of these examples are for Microsoft Access or SQL Server and include either square brackets or quotes around table names:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT id, category_name\r\nFROM \u201cProduct Category\u201d;<\/pre>\n<p>Using a table name with a space in it might be easier to read. However, it can cause several issues. Table names in Oracle are stored in upper case, or if they have quotes, they are stored as you enter them. This means whenever you refer to this table, you\u2019ll need to use quotes and specify it as it was written. It\u2019s inconvenient for you and for other developers.<\/p>\n<p>Another reason is that it\u2019s harder to refer to this table in queries. You\u2019ll have to specify it with quotes, and probably need to use a table alias to ensure your queries are correct.<\/p>\n<p>It\u2019s much better to specify the object names without spaces. You can use underscores instead:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT id, category_name\r\nFROM product_category;<\/pre>\n<p>You should follow your team\u2019s naming conventions, which would include tables and other objects, which has been <a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/oracle\/validating-naming-conventions-oracle\/\">written about here<\/a>.<\/p>\n<p>From Oracle 12c, the maximum length for an object name was increased from 30 characters to 32,000 characters. This means you\u2019ll have a lot more room to come up with a great name for a table. This doesn\u2019t mean you should be excessive, but just choose a name that represents what you are creating, without using spaces.<\/p>\n<h2>Summary<\/h2>\n<p>So, there are my top 10 best practices for working with Oracle SQL. Some of them are applicable to all types of SQL, but many of them are Oracle specific.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>There is often more than one way to write a query that returns the same results, but some methods may perform better than others. In this article, Ben Brumm discusses ten best practice when writing SQL code for Oracle. &hellip;<\/p>\n","protected":false},"author":317322,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143533],"tags":[124952],"coauthors":[50619],"class_list":["post-80543","post","type-post","status-publish","format-standard","hentry","category-oracle-databases","tag-redgate-deploy"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/80543","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/317322"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=80543"}],"version-history":[{"count":5,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/80543\/revisions"}],"predecessor-version":[{"id":83356,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/80543\/revisions\/83356"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=80543"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=80543"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=80543"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=80543"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}