{"id":73183,"date":"2014-10-16T10:35:38","date_gmt":"2014-10-16T10:35:38","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/uncategorized\/external-tables-an-introduction\/"},"modified":"2021-07-14T13:07:29","modified_gmt":"2021-07-14T13:07:29","slug":"external-tables-an-introduction","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/oracle-databases\/external-tables-an-introduction\/","title":{"rendered":"External Tables: An Introduction"},"content":{"rendered":"<p>At its most basic level, Oracle databases are built around data based within tables held within the database, and the moulding of that data into information and meaning. Under this model, data is created within the database or imported into it.<\/p>\n<p>There is, however, a third way. Using external tables, Oracle can query the contents of flat files held <i>outside<\/i> the database, in pretty much the same way that you would query a more normal, vanilla, database table.<\/p>\n<p>Our sales department, for example, use a non-Oracle application. At the close of each day they export the day\u2019s sales into a csv file which they save in an agreed location. This means that all I need do is create an external table which points at this file, and I am then able to query the data and use it to populate tables in my Oracle database. Using an external table makes this process pretty much painless; in the old days we needed a morass of scripts, data entry and loaders to achieve the same thing.<\/p>\n<h4>Restrictions<\/h4>\n<p>There are, as you would expect, some restrictions. External tables are powered by the ORACLE_LOADER access driver, and therefore can only read data stored in formats that can be loaded using SQL*Loader. Most of the files I receive are comma-separated and that suits me just fine, but with a little cleverness, we can even use XML.<\/p>\n<p>Secondly, using external tables you cannot perform DML on your flat files; no deletes, no updates. And frankly, you should not want to. Think of yourself as a Hollywood-style bank robber: go in, grab the data, get out. In other words: query the external table, use that data to populate other tables, and then leave it alone. Don\u2019t start making it an ordinary working part of your application; that\u2019s just crazy.<\/p>\n<p>There is a good reason for this, and that brings us to our third restriction: you cannot index external tables. This means, of course, that you should not expect an external table to perform at the speed of an ordinary table. So be a bank robber: go in, grab the data to populate other tables, and leave.<\/p>\n<p>Finally, and this isn\u2019t exactly a restriction per se, the flat file that the external table is built upon must be located somewhere accessible to Oracle. As you know, directories are Oracle\u2019s means of mapping OS paths. So ensure your flat file is located in a folder that can be reached by Oracle directory, and that all the read privileges are present and correct.<\/p>\n<h4>Creating an External Table<\/h4>\n<p>To create an external table we use the following syntax: CREATE TABLE \u2026 ORGANIZATION EXTERNAL. The CREATE TABLE section of the statement is identical to the usual syntax that you would use to create a table: and the ORGANIZATION EXTERNAL section is made up of the access parameters (also known as the opaque format spec) and information about the location of the source file.<\/p>\n<p>Let\u2019s shine some light on the opaque format spec. The access parameters are made up of record format information, field definitions and any column transformations.<\/p>\n<p>The <strong>record_format_info<\/strong> clause is optional. When present it must begin with the RECORDS keyword.<\/p>\n<p>These are some of the subclauses you may wish to use:<\/p>\n<ul>\n<li>DELIMITED BY: This subclause expresses how the fields are delimited and terminated.<\/li>\n<li>LOAD WHEN: Using this subclause we can supply a conditional statement by which the decision to load or not load a record is made.<\/li>\n<li>BADFILE: Specifies a file (including a valid directory) to which erroring records should be written. If a record does not pass the LOAD WHEN condition it is not written to the bad file.\u00a0 Instead there is a subclause, DISCARDFILE, that specifies where such records should be logged.<\/li>\n<li>LOGFILE: A valid Oracle directory and file name for all logging.<\/li>\n<li>SKIP: This specifies a number of records that should be skipped before loading begins. If, for example, your csv begins with a header record, you will want to skip the first record.<\/li>\n<\/ul>\n<p>The <strong>field_definitions<\/strong> clause details the fields in the data file and outlines how they build up the record. This clause must begin with the FIELDS keyword.<\/p>\n<ul>\n<li>ENCLOSED BY: Shows the string(s) that encapsulates each field.<\/li>\n<li>TERMINATED BY: Specifies the string that terminates the fields or the keyword WHITESPACE.<\/li>\n<\/ul>\n<p>The column transformations outline any transformations that should be applied to each column. Transformations are specified column-by-column.<\/p>\n<ul>\n<li>NULL: If you apply the NULL transform, the column will be set to null irrespective of what is in the data file.<\/li>\n<li>CONSTANT: With the CONSTANT transform, the column will be set to the specified string for all records.<\/li>\n<\/ul>\n<h4>Example<\/h4>\n<p>An example will probably help you make some sense out of my huge spewing of subclauses.<\/p>\n<p>Let\u2019s assume the Sales team create a daily spreadsheet of sales details, and we need to build an external table to query it.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-5260\" alt=\"csv\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/oracle\/2014\/10\/csv.jpg\" width=\"405\" height=\"272\" srcset=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/oracle\/2014\/10\/csv.jpg 405w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/oracle\/2014\/10\/csv-300x201.jpg 300w\" sizes=\"auto, (max-width: 405px) 100vw, 405px\" \/><\/p>\n<pre>CREATE TABLE ext_daily_sales (\r\n   Sale_time\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 DATE,\r\n   Sale_item\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 VARCHAR2(100),\r\n   Quantity\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0NUMBER\r\n)\r\nORGANIZATION EXTERNAL\r\n(TYPE ORACLE_LOADER\r\n   DEFAULT DIRECTORY sales_dir\r\n   ACCESS PARAMETERS (\r\n      RECORDS DELIMITED BY NEWLINE\r\n         LOAD WHEN (quantity &gt; 1)\r\n         BADFILE sales_dir:'dailysales.bad'\r\n         LOGFILE sales_dir:'dailysales.log'\r\n         DISCARDFILE sales_dir:'dailysales-discarded.log'\r\n         SKIP 1\r\n      FIELDS delimited by ','\r\n        (sale_time char(20) DATE_FORMAT DATE mask \"dd\/mm\/yyyy hh24:mi:ss\",\r\n         Sale_item,\r\n         Quantity)\r\n   )\r\n   LOCATION ('daily sales report.csv')\r\n);<\/pre>\n<p>There are a few things to note. The RECORDS and FIELDS keywords must be present \u2013 unless you completely exclude the record_format_info and\/or field_definition clauses. Secondly, your file names \u2013 as used, for example, in the badfile and logfile subclauses \u2013 should contain your Oracle directory name concatenated to your file name by means of a colon. If you omit the directory name, the file will be created in your DEFAULT DIRECTORY.<\/p>\n<h4>XML<\/h4>\n<p>At first glance there seems to be very little leeway at the heart of the external table functionality; however, with just a little cunning, we can use it to read XML files.<\/p>\n<p>Assume the Daily Sales Total report that we receive is an XML file.\u00a0 We could create an external table upon it.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-5261\" alt=\"xml\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/oracle\/2014\/10\/xml.jpg\" width=\"407\" height=\"403\" srcset=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/oracle\/2014\/10\/xml.jpg 407w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/oracle\/2014\/10\/xml-150x150.jpg 150w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/oracle\/2014\/10\/xml-300x297.jpg 300w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/oracle\/2014\/10\/xml-60x60.jpg 60w\" sizes=\"auto, (max-width: 407px) 100vw, 407px\" \/><\/p>\n<pre>CREATE TABLE ext_sales_totals (\r\n   Sale_date\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 DATE,\r\n   Sale_item\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 VARCHAR2(100),\r\n   Quantity\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0NUMBER\r\n)\r\nORGANIZATION EXTERNAL\r\n(TYPE ORACLE_LOADER\r\n   DEFAULT DIRECTORY sales_dir\r\n   ACCESS PARAMETERS (\r\n      RECORDS DELIMITED BY '&lt;\/Item&gt;'\r\n         LOGFILE 'sales_totals.log'\r\n      FIELDS (\r\n         Sale_date   CONSTANT SYSDATE,\r\n         Sale_item\u00a0  CHAR(100) ENCLOSED BY '&lt;Name&gt;' AND '&lt;\/Name&gt;',\r\n         Quantity    CHAR(5) ENCLOSED BY '&lt;Quantity&gt;' AND '&lt;\/Quantity&gt;')\r\n  )\r\n  LOCATION ('sales totals.xml')\r\n);<\/pre>\n<h4>Modifying External Tables<\/h4>\n<p>The sands can shift beneath the feet of an external table quite easily.\u00a0 At the close of each business day, the sales team create new versions of <i>daily sales report.csv<\/i> and <i>sales totals.xml<\/i> and overwrite the old copies of these files.\u00a0 That way, when we rerun our scripts the next morning, we have a whole new suite of data to upload.<\/p>\n<p>If, however, you need to make structural &#8211; DDL &#8211; amendments to your external table, this is also possible.\u00a0 If, for example, your default directory or your file name changes, you can issue the following alter statements.<\/p>\n<pre>ALTER TABLE ext_daily_sales\r\nDEFAULT DIRECTORY new_sales_dir;<\/pre>\n<pre>ALTER TABLE ext_sales_totals\r\nLOCATION ('new sales totals.xml');<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>At its most basic level, Oracle databases are built around data based within tables held within the database, and the moulding of that data into information and meaning. Under this model, data is created within the database or imported into it. There is, however, a third way. Using external tables, Oracle can query the contents of flat files held outsid&hellip;<\/p>\n","protected":false},"author":221907,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143533],"tags":[48408,4217],"coauthors":[],"class_list":["post-73183","post","type-post","status-publish","format-standard","hentry","category-oracle-databases","tag-external-tables","tag-xml"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/73183","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/221907"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=73183"}],"version-history":[{"count":1,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/73183\/revisions"}],"predecessor-version":[{"id":91671,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/73183\/revisions\/91671"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=73183"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=73183"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=73183"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=73183"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}