{"id":96677,"date":"2023-05-04T19:56:16","date_gmt":"2023-05-04T19:56:16","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=96677"},"modified":"2023-07-19T14:58:35","modified_gmt":"2023-07-19T14:58:35","slug":"importing-data-into-a-mysql-database-using-load-data","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/mysql\/importing-data-into-a-mysql-database-using-load-data\/","title":{"rendered":"Importing data into a MySQL database using LOAD DATA"},"content":{"rendered":"<p style=\"margin: 0in; font-family: Calibri; font-size: 11.0pt;\"><p><strong>This article is part of Robert Sheldon's continuing series on Learning MySQL. To see all of the items in the series, <a href=\"https:\/\/www.red-gate.com\/simple-talk\/mysql-coding-basics\/\">click here<\/a>.<\/strong><\/p>\n<\/p>\n<p>Database and development teams often load data from plain text files into their MySQL databases. The files might be used to add lookup data, support test and development environments, populate new MySQL instances, load data from regular feeds, or in other ways support their operations. To help with the import process, MySQL provides the <code>LOAD<\/code> <code>DATA<\/code> statement, which reads rows from a text file and inserts them into the target table.<\/p>\n<p>In this article, I show how to use the <code>LOAD<\/code> <code>DATA<\/code> statement to add data from comma-separated values (CSV) files and other plain text files. Although the examples are fairly basic, they demonstrate the fundamental components that go into a <code>LOAD<\/code> <code>DATA<\/code> statement and some of the issues you might run up against along the way. Each example retrieves data from a file on the local system and adds the data to the <code>manufacturers<\/code> table in the <code>travel<\/code> database, which you\u2019ve seen in previous articles in this series.<\/p>\n<p>Note: The examples in this article are based on a local instance of MySQL that hosts a very simple database and table. The last section of the article\u2014\u201cAppendix: Preparing your MySQL environment\u201d\u2014provides information about how I set up my system and includes a SQL script for creating the database and table on which the examples are based.<\/p>\n<h2>Connecting to the MySQL server<\/h2>\n<p>Importing data from a text file into a MySQL database is in itself a fairly straightforward process. Often the most difficult part of the operation is setting up your environment to ensure that it will allow you to run a <code>LOAD<\/code> <code>DATA<\/code> statement and import the data into the target table. As with any SQL statement in MySQL, you must have been granted the privileges necessary to carry out your operations (a topic beyond the scope of this article). However, there are a few other issues to be aware of in order to import data, starting with the <code>LOCAL<\/code> option.<\/p>\n<p>When you create a <code>LOAD<\/code> <code>DATA<\/code> statement, you can include the <code>LOCAL<\/code> option as part of the statement definition. The option determines the statement\u2019s security requirements as well as whether the source text file is located on the client system or the server hosting the MySQL instance:<\/p>\n<ul>\n<li>If you do not specify the <code>LOCAL<\/code> option, the source text file must be located on the MySQL host. When you run the <code>LOAD<\/code> <code>DATA<\/code> statement, MySQL reads the file directly from the directory and inserts the data into the target table. This approach generally performs a little better than when you include the <code>LOCAL<\/code> option because the data is loaded more directly. However, getting the connection right is much more difficult (and the topic for many online discussions).<\/li>\n<li>If you specify the <code>LOCAL<\/code> option, the source text file must be located on the client machine. The client reads the file and sends the content to the server, where it is stored in a temporary file until it is loaded into the target table for processing. The <code>LOCAL<\/code> option also works if the client and MySQL are on the same machine, which is the approach I\u2019ve taken for this article. Connectivity is typically much easier to establish when you use the <code>LOCAL<\/code> option.<\/li>\n<\/ul>\n<p>For the examples in this article, I have used the <code>LOCAL<\/code> option. Not only are the MySQL connectivity requirements more complicated without it, but they are also not well documented, adding to the frustration should you run into any glitches. If you check out the various forum postings that discuss connectivity issues around the <code>LOAD<\/code> <code>DATA<\/code> statement, you\u2019ll find that in many cases, people responding to a post suggest the use of the <code>LOCAL<\/code> option as a simple workaround to the various challenges.<\/p>\n<p>I also think that for many database administrators and developers, locating source files on the client side is preferable to uploading those files to the MySQL server, if they\u2019re even permitted to do so. If you use the <code>LOCAL<\/code> option, you do not need <a href=\"https:\/\/dev.mysql.com\/doc\/refman\/8.0\/en\/privileges-provided.html#priv_file\">the FILE privilege<\/a> to run a <code>LOAD<\/code> <code>DATA<\/code> statement, and you can store the source text file in any local folder that can be accessed by the client application, which in this case, is MySQL Workbench.<\/p>\n<p>Note: The MySQL documentation states that \u201cif <code>LOCAL<\/code> is specified, the file must be located on the client host.\u201d However, I was able to run a <code>LOAD<\/code> <code>DATA<\/code> statement that included the <code>LOCAL<\/code> option and that pulled data from other systems on my network. The first was another Mac computer and the other was Windows 11 virtual machine. I have not tested this capability beyond that.<\/p>\n<p>When using the <code>LOCAL<\/code> option, you must ensure that data-loading is enabled on both the client side and server side. To enable it on the client side in Workbench, you should modify your connection on the tool\u2019s home screen. In the main window, right-click the connection and click <strong>Edit connection<\/strong>. On the <strong>Connection<\/strong> page of the <strong>Manage Server Connections<\/strong> dialog box, select the <strong>Advanced<\/strong> tab and add the following command in the <strong>Others<\/strong> box:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">OPT_LOCAL_INFILE=1<\/pre>\n<p>The command sets the <code>local-infile<\/code> option to <code>ON<\/code>, making it possible to run a <code>LOAD<\/code> <code>DATA<\/code> statement that includes the <code>LOCAL<\/code> option. The following figure shows the setting (outlined in red) as it appears on the connection\u2019s <strong>Advanced<\/strong> tab. This setting applies only to this user\u2019s connections in Workbench. Other connections must be configured individually.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1854\" height=\"1256\" class=\"wp-image-96678\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2023\/04\/word-image-96677-1.png\" \/><\/p>\n<p>In addition to enabling the <code>local-infile<\/code> option, you must also enable the <code>local_infile<\/code> global variable on the server, if it\u2019s not already enabled. (The only difference between these two names is that the global variable uses an underscore rather than a dash.) To confirm the variable\u2019s setting, you can run a <code>SHOW<\/code> <code>GLOBAL<\/code> <code>VARIABLES<\/code> statement against your MySQL instance:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SHOW GLOBAL VARIABLES LIKE 'local_infile';<\/pre>\n<p>If the statement returns a value of <code>ON<\/code>, then you\u2019re all set. If the statement returns <code>OFF<\/code>, then you should run the following <code>SET<\/code> statement to enable the variable:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SET GLOBAL local_infile = 1; <\/pre>\n<p>Once you\u2019ve enabled local data-loading on both the client and server, you should be ready to run your <code>LOAD<\/code> <code>DATA<\/code> statements. The examples that follow demonstrate different aspects of importing data from a text file. I\u2019ll show you the contents of each file as we work through the examples. You can then create them on your own system if you want to try out the examples for yourself.<\/p>\n<h2>Introducing the <strong>LOAD DATA<\/strong> statement<\/h2>\n<p>Before we get into the first example, it\u2019s important to understand the basic components that go into a <code>LOAD<\/code> <code>DATA<\/code> statement, which includes a number of clauses and subclauses. The following syntax simplifies the statement somewhat to give you an overview of the statement\u2019s essential elements and how they fit together:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">LOAD DATA [LOCAL] \r\nINFILE 'file_name'\r\n[REPLACE | IGNORE]\r\nINTO TABLE table_name\r\nFIELDS\r\n  [TERMINATED BY 'string']\r\n  [[OPTIONALLY] ENCLOSED BY 'char']\r\n  [ESCAPED BY 'char']\r\nLINES\r\n  [STARTING BY 'string']\r\n  [TERMINATED BY 'string']\r\nIGNORE n LINES\r\n[(column_list)]<\/pre>\n<p>The <code>LOAD<\/code> <code>DATA<\/code> clause is where you specify whether to include the <code>LOCAL<\/code> option. As I mentioned earlier, this is the approach I\u2019ve taken in this article. The next clause, <code>INFILE<\/code>, specifies the path and filename (in quotes) of the source text file. You can provide an absolute path or relative path. If relative, the path is relative to the invocation directory.<\/p>\n<p>You can then specify either <code>REPLACE<\/code> or <code>IGNORE<\/code>, which are both optional. The <code>REPLACE<\/code> option tells MySQL to replace existing rows that have the same unique key value. The <code>IGNORE<\/code> option tells MySQL to ignore rows with the same key value. The <code>IGNORE<\/code> option has the same effect as the <code>LOCAL<\/code> option, so if you\u2019re using <code>LOCAL<\/code>, you never need to use <code>IGNORE<\/code>. However, you can use the <code>REPLACE<\/code> option with <code>LOCAL<\/code>.<\/p>\n<p>The <code>INTO<\/code> <code>TABLE<\/code> clause specifies the name of the target table. The main thing here is to be sure that you\u2019ve been granted the privileges necessary to add data to that table.<\/p>\n<p>The <code>FIELDS<\/code> clause comes next, and it supports one or more of the following three subclauses:<\/p>\n<ul>\n<li>The <code>TERMINATED<\/code> <code>BY<\/code> subclause specifies the string used in the text file to terminate each field. The string can be one or more characters. The default value is <code>\\t<\/code> for tab, which means that tabs are used to separate field values.<\/li>\n<li>The <code>ENCLOSED<\/code> <code>BY<\/code> subclause specifies the character used in the text file to enclose values, such as quotation marks around string values. The <code>OPTIONALLY<\/code> keyword, which itself is optional, is used \u201cif the input values are not necessarily enclosed within quotation marks,\u201d according to MySQL documentation. (More on that in a bit.) The default value for the <code>ENCLOSED<\/code> <code>BY<\/code> subclause is an empty string, indicating that the fields are not enclosed in quoting characters.<\/li>\n<li>The <code>ESCAPED<\/code> <code>BY<\/code> subclause specifies the character used in the text file for escaping characters that could impact how MySQL interprets the data. The default value is a backslash (<code>\\<\/code>), which is also used in MySQL to escape characters, including the backslash itself. Many programming languages also use the backslash to escape characters.<\/li>\n<\/ul>\n<p>The <code>FIELDS<\/code> clause is itself optional, but if you include it, you must specify at least one of the subclauses.<\/p>\n<p>Note: The <code>OPTIONALLY<\/code> option in the <code>ENCLOSED<\/code> <code>BY<\/code> subclause is one of the most confusing elements in the <code>LOAD<\/code> <code>DATA<\/code> statement. Its use made no difference in the various tests I ran. For example, in one test, I enclosed all the values in the <code>manufacturer<\/code> fields in double quotation marks except for one. MySQL imported the data correctly whether or not I included the <code>OPTIONALLY<\/code> option. I also tested the option using <code>NULL<\/code> values and empty strings and received the same results. There might be use cases in which the option does make a difference, but I have yet to discover them. However, the <code>FIELDS<\/code> and <code>LINES<\/code> clauses in the <code>LOAD<\/code> <code>DATA<\/code> statement are the same as the <code>SELECT\u2026INTO<\/code> <code>OUTFILE<\/code> statement, and much of the discussion in the MySQL documentation about the <code>OPTIONALLY<\/code> option is related to <code>SELECT\u2026INTO<\/code> <code>OUTFILE<\/code>, so perhaps that is where it is most relevant.<\/p>\n<p>Like the <code>FIELDS<\/code> clause, the <code>LINES<\/code> clause is also optional. The <code>LINES<\/code> clause supports the following two subclauses:<\/p>\n<ul>\n<li>The <code>STARTING<\/code> <code>BY<\/code> clause specifies the common prefix used at the beginning of each line in the text file. The default value is an empty string, indicating that no specific prefix is used. If a prefix is specified and a line does not contain that prefix, MySQL skips the line when importing the data.<\/li>\n<li>The <code>TERMINATED<\/code> <code>BY<\/code> clause specifies the string used in the text file to terminate each line. The string can be one or more characters. The default value is <code>\\n<\/code>, which refers to a newline character (linefeed). I created my text file in Apple\u2019s TextEdit app, so the default worked on my system, but not all systems operate the same. For example, if you create the text files in Windows, you might need to specify <code>'\\r\\n'<\/code> as the <code>TERMINATED<\/code> <code>BY<\/code> value.<\/li>\n<\/ul>\n<p>If you include both the <code>FIELDS<\/code> clause and <code>LINES<\/code> clause, the <code>FIELDS<\/code> clause must come first. The <code>IGNORE<\/code> <code><em>n<\/em><\/code> <code>LINES<\/code> clause comes after these two clauses. The <code>IGNORE<\/code> <code><em>n<\/em><\/code> <code>LINES<\/code> clause specifies the number of lines to skip at the beginning of the file when importing the data. The clause is commonly used when the file contains a header row, in which case, the clause would be written as <code>IGNORE<\/code> <code>1<\/code> <code>LINES<\/code>.<\/p>\n<p>The final clause is the list of columns, which are enclosed in parentheses and separated by commas. Although this clause is optional, you will likely include it in most of your statements, unless you\u2019re source data contains a field for every column and the fields are in the same order as the columns.<\/p>\n<p>The <code>LOAD<\/code> <code>DATA<\/code> statement contains a few other clauses, but the ones I\u2019ve shown you here are plenty for you to get started. Even so, I recommend that you review the MySQL topic <a href=\"https:\/\/dev.mysql.com\/doc\/refman\/8.0\/en\/load-data.html\">LOAD DATA Statement<\/a> to learn more about the statement\u2019s various elements.<\/p>\n<h2>Importing a CSV file<\/h2>\n<p>Now that you\u2019ve been introduced to the <code>LOAD<\/code> <code>DATA<\/code> statement, let\u2019s look at some examples that show it in action. You can refer back to the previous section if needed as you work through the following sections.<\/p>\n<p>In preparation for the first example, I created a file named <code>manufacturers1.csv<\/code> and added the following data:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">101,Airbus\r\n102,Beagle Aircraft Limited\r\n103,Beechcraft\r\n104,Boeing\r\n105,Bombardier\r\n106,Cessna\r\n107,Embraer<\/pre>\n<p>I saved the file to the folder <code>\/Users\/mac3\/Documents\/TravelData\/<\/code> on my local computer. If you plan to try out the examples for yourself, you can save the files to any location on your system that Workbench can access. Just be sure to update the file path in the examples before you run your statements.<\/p>\n<p>After I created the <strong><code>manufacturers1.csv<\/code><\/strong> file, I ran the following <code>LOAD<\/code> <code>DATA<\/code> statement, which saves the data to the <code>manufacturers<\/code> table in the <code>travel<\/code> database:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">LOAD DATA LOCAL INFILE \r\n  '\/Users\/mac3\/Documents\/TravelData\/manufacturers1.csv' \r\nINTO TABLE manufacturers \r\nFIELDS TERMINATED BY ',' \r\n(manufacturer_id, manufacturer);<\/pre>\n<p>As you can see, the <code>LOAD<\/code> <code>DATA<\/code> clause includes the <code>LOCAL<\/code> option, and the <code>INFILE<\/code> clause specifies the source file. These are followed by the <code>INTO<\/code> <code>TABLE<\/code> clause, which points to the <code>manufacturers<\/code> table.<\/p>\n<p>The next clause, <code>FIELDS<\/code>, includes the <code>TERMINATED<\/code> <code>BY<\/code> subclause, which specifies that a comma is used as the field separator, rather than the default tab. The statement then provides the names of the two target columns\u2014<code>manufacturer_id<\/code> and <code>manufacturer<\/code>\u2014which are enclosed in parentheses.<\/p>\n<p>When you run the statement, MySQL extracts the data from the file and populates the <code>manufacturers<\/code> table. You can verify that the data has been added to the table by running the following <code>SELECT<\/code> statement:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT * FROM manufacturers;<\/pre>\n<p>The <code>SELECT<\/code> statement returns the results shown in the following figure, which indicates that the data was successfully inserted into the table. Keep this statement handy because you can use it to verify your results for the remaining examples.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1356\" height=\"442\" class=\"wp-image-96679\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2023\/04\/word-image-96677-2.png\" \/><\/p>\n<p>To keep things simple for this article, you can also run the following <code>TRUNCATE<\/code> statement to remove the data from the <code>manufacturers<\/code> table in preparation for the next example:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">TRUNCATE TABLE manufacturers;<\/pre>\n<p>You should keep this statement handy as well. You\u2019ll want to run it after most of the following examples, except in a couple instances where I demonstrate specific concepts, in which case, I\u2019ll let you know not to run it.<\/p>\n<h2>Ignoring the first lines in an import file<\/h2>\n<p>Some of the source files that you work with might contain a header row that lists the field names or include other types of information, such as comments about when and where the file was generated. You can skip these rows when importing the data by including the <code>IGNORE<\/code> <code><em>n<\/em><\/code> <code>LINES<\/code> clause in your <code>LOAD<\/code> <code>DATA<\/code> statement.<\/p>\n<p>To see how this works, create a text file named <strong><code>manufacturers2.csv<\/code><\/strong> file, add the following data to the file, and save it to the same location as the <strong><code>manufacturers1.csv<\/code><\/strong> file:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">manufacturer_id,manufacturer\r\n101,Airbus\r\n102,Beagle Aircraft Limited\r\n103,Beechcraft\r\n104,Boeing\r\n105,Bombardier\r\n106,Cessna\r\n107,Embraer<\/pre>\n<p>Now run the following <code>LOAD<\/code> <code>DATA<\/code> statement, which includes an <code>IGNORE<\/code> <code>1<\/code> <code>LINES<\/code> clause that tells MySQL to skip the first row:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">LOAD DATA LOCAL INFILE \r\n  '\/Users\/mac3\/Documents\/TravelData\/manufacturers2.csv' \r\nINTO TABLE manufacturers \r\nFIELDS TERMINATED BY ',' \r\nIGNORE 1 LINES\r\n(manufacturer_id, manufacturer);<\/pre>\n<p>After you execute the <code>LOAD<\/code> <code>DATA<\/code> statement, you can again run your <code>SELECT<\/code> statement to verify that the correct data has been added. The results should indicate that the header row has been omitted. You can then run your <code>TRUNCATE<\/code> statement again in preparation for the next example.<\/p>\n<p>The <code>IGNORE<\/code> <code><em>n<\/em><\/code> <code>LINES<\/code> clause is not limited to one row. For instance, the following <code>IGNORE<\/code> <code><em>n<\/em><\/code> <code>LINES<\/code> clause specifies five rows rather than one:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">LOAD DATA LOCAL INFILE \r\n  '\/Users\/mac3\/Documents\/TravelData\/manufacturers2.csv' \r\nINTO TABLE manufacturers \r\nFIELDS TERMINATED BY ',' \r\nIGNORE 5 LINES\r\n(manufacturer_id, manufacturer);<\/pre>\n<p>When you run the <code>SELECT<\/code> statement this time, you should get the results shown in the following figure. (Don\u2019t truncate the table for this example or the next one because I want to point out of a couple other issues.)<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1356\" height=\"302\" class=\"wp-image-96680\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2023\/04\/word-image-96677-3.png\" \/><\/p>\n<p>As you can see, the table contains only the last three rows from the source file. However, suppose that you were to run the statement again, only this time, specifying only one row in the <code>IGNORE<\/code> <code><em>n<\/em><\/code> <code>LINES<\/code> clause:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">LOAD DATA LOCAL INFILE \r\n  '\/Users\/mac3\/Documents\/TravelData\/manufacturers2.csv' \r\nINTO TABLE manufacturers \r\nFIELDS TERMINATED BY ',' \r\nIGNORE 1 LINES\r\n(manufacturer_id, manufacturer);<\/pre>\n<p>When you execute the statement, MySQL tries to insert all seven rows of data into the target table, but only the first four rows succeed. After running the statement, MySQL returns the following message:<\/p>\n<p><code>4 row(s) affected, 3 warning(s): 1062 Duplicate entry '105' for key 'manufacturers.PRIMARY' 1062 Duplicate entry '106' for key 'manufacturers.PRIMARY' 1062 Duplicate entry '107' for key 'manufacturers.PRIMARY' Records: 7  Deleted: 0  Skipped: 3  Warnings: 3<\/code><\/p>\n<p>The message indicates that the existing rows with <code>manufacturer_id<\/code> values of <code>105<\/code>, <code>106<\/code>, and <code>107<\/code> were skipped. That is, no new rows with these values were inserted into the table. Only the first four rows were added. If you run the <code>SELECT<\/code> statement again, you should receive results similar to those shown in the following figure. (Once again, don\u2019t truncate the table; leave it for the next example.)<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1386\" height=\"392\" class=\"wp-image-96681\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2023\/04\/word-image-96677-4.png\" \/><\/p>\n<p>The table now contains all seven rows of data, but if you look closely at the timestamps in the figure, you\u2019ll see that the last three rows precede the first five rows by nearly 30 seconds. (I ran the last two <code>LOAD<\/code> <code>DATA<\/code> statements fairly close together.)<\/p>\n<p>Now suppose you run the same <code>LOAD<\/code> <code>DATA<\/code> statement again, only this time you include the <code>REPLACE<\/code> option:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">LOAD DATA LOCAL INFILE \r\n  '\/Users\/mac3\/Documents\/TravelData\/manufacturers2.csv' \r\nREPLACE\r\nINTO TABLE manufacturers \r\nFIELDS TERMINATED BY ',' \r\nIGNORE 1 LINES\r\n(manufacturer_id, manufacturer);<\/pre>\n<p>When you execute the statement, MySQL now returns the following message:<\/p>\n<p><code>14 row(s) affected Records: 7  Deleted: 7  Skipped: 0  Warnings: 0<\/code><\/p>\n<p>The message indicates that 14 rows were processed. However, only seven records were affected, and seven were deleted. This means that the database engine deleted the seven existing records and re-added them to the table. You can verify this be running the <code>SELECT<\/code> statement again. Your results should show different timestamps than in the previous results, with the all values very close, if not the same.<\/p>\n<p>You can now rerun your <code>TRUNCATE<\/code> <code>TABLE<\/code> statement to prepare the <code>manufacturers<\/code> table for the next example.<\/p>\n<h2>Working with quoted fields in the import file<\/h2>\n<p>When importing data, your text files might include some or all fields enclosed in quotation marks. For example, I created the <strong><code>manufacturers3.csv<\/code><\/strong> file using the following data, which includes single quotation marks around the string values:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">manufacturer_id,manufacturer\r\n101,'Airbus'\r\n102,'Beagle Aircraft Limited'\r\n103,'Beechcraft'\r\n104,'Boeing'\r\n105,'Bombardier'\r\n106,'Cessna'\r\n107,'Embraer'<\/pre>\n<p>To handle the quoted fields, you can add an <code>ENCLOSED<\/code> <code>BY<\/code> subclause to your <code>FIELDS<\/code> clause, as shown in the following example:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">LOAD DATA LOCAL INFILE \r\n  '\/Users\/mac3\/Documents\/TravelData\/manufacturers3.csv' \r\nINTO TABLE manufacturers \r\nFIELDS TERMINATED BY ',' ENCLOSED BY '\\''\r\nIGNORE 1 LINES\r\n(manufacturer_id, manufacturer);<\/pre>\n<p>The <code>ENCLOSED<\/code> <code>BY<\/code> subclause specifies that a single quotation mark is used to enclose fields. The quotation mark is preceded by a backslash to escape the character when submitting it to the database engine. If you don\u2019t use the <code>ENCLOSED<\/code> <code>BY<\/code> subclause, the database engine will treat the quotation marks as literal values and store them along with the rest of the values.<\/p>\n<p>After you execute the <code>LOAD<\/code> <code>DATA<\/code> statement, you can run your <code>SELECT<\/code> statement to verify the results and then run your <code>TRUNCATE<\/code> statement to prepare the <code>manufacturers<\/code> table for the next example.<\/p>\n<p>When you specify a single quotation mark in the <code>ENCLOSED<\/code> <code>BY<\/code> subclause, you can enclose it in double quotes, rather than escaping it with a backslash:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">LOAD DATA LOCAL INFILE \r\n  '\/Users\/mac3\/Documents\/TravelData\/manufacturers3.csv' \r\nINTO TABLE manufacturers \r\nFIELDS TERMINATED BY ',' ENCLOSED BY \"'\"\r\nIGNORE 1 LINES\r\n(manufacturer_id, manufacturer);<\/pre>\n<p>In some cases, the text file will use double quotation marks to enclose field values, rather than single quotation marks. To demonstrate how to handle these, I created the <strong><code>manufacturers4.csv<\/code><\/strong> file using the following data:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">manufacturer_id,manufacturer\r\n101,\"Airbus\"\r\n102,\"Beagle Aircraft Limited\"\r\n103,\"Beechcraft\"\r\n104,\"Boeing\"\r\n105,\"Bombardier\"\r\n106,\"Cessna\"\r\n107,\"Embraer\"<\/pre>\n<p>To handle this file, the <code>ENCLOSED<\/code> <code>BY<\/code> subclause should be modified to specify a double quotation mark, enclosing it in single quotation marks:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">LOAD DATA LOCAL INFILE \r\n  '\/Users\/mac3\/Documents\/TravelData\/manufacturers4.csv' \r\nINTO TABLE manufacturers \r\nFIELDS TERMINATED BY ',' ENCLOSED BY '\"'\r\nIGNORE 1 LINES\r\n(manufacturer_id, manufacturer);<\/pre>\n<p>After you run this <code>LOAD<\/code> <code>DATA<\/code> statement, you can once again run your <code>SELECT<\/code> statement to verify the results. Once you\u2019ve reviewed them, you can then run your <code>TRUNCATE<\/code> statement in preparation for the next example. (You should do this for all the remaining examples.)<\/p>\n<h2>Working with different formats in your text files<\/h2>\n<p>The text files that you work with might be table-delimited rather than comma-delimited, and they might include other elements that require special handling. Consider the <strong><code>manufacturers5.txt<\/code><\/strong> file, which I created with the following data:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">manufacturer_id\tmanufacturer\r\n*,*101\t\"Airbus\"\r\n*,*102\t\"Beagle Aircraft Limited\"\r\n*,*103\t\"Beechcraft\"\r\n*,*104\t\"Boeing\"\r\n*,*105\t\"Bombardier\"\r\n*,*106\t\"Cessna\"\r\n*,*107\t\"Embraer\"<\/pre>\n<p>In this case, a tab is used as the field separator, and each line is preceded by the <code>*,* <\/code>characters. As a result, you don\u2019t need to specify the <code>TERMINATED<\/code> <code>BY<\/code> subclause in the <code>FIELDS<\/code> clause because the tab is the default value, but you do need to take steps to handle the line prefix. For this, you should add a <code>LINES<\/code> clause with a <code>STARTING<\/code> <code>BY<\/code> subclause that specifies the prefix characters:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">LOAD DATA LOCAL INFILE \r\n  '\/Users\/mac3\/Documents\/TravelData\/manufacturers5.txt' \r\nINTO TABLE manufacturers \r\nFIELDS ENCLOSED BY '\"'\r\nLINES STARTING BY '*,*'\r\nIGNORE 1 LINES\r\n(manufacturer_id, manufacturer);<\/pre>\n<p>When you run this statement, MySQL will use the prefix characters to determine which rows to add, while stripping out the characters in the process.<\/p>\n<p>As already pointed out, the preceding example does not include a <code>TERMINATED<\/code> <code>BY<\/code> subclause in the <code>FIELDS<\/code> clause. It also does not include a <code>TERMINATED<\/code> <code>BY<\/code> subclause in the <code>LINES<\/code> clause because the text file uses the default linefeed value. However, you can still include both clauses if you want:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">LOAD DATA LOCAL INFILE \r\n  '\/Users\/mac3\/Documents\/TravelData\/manufacturers5.txt' \r\nINTO TABLE manufacturers \r\nFIELDS TERMINATED BY '\\t' ENCLOSED BY '\"'\r\nLINES TERMINATED BY '\\n' STARTING BY '*,*'\r\nIGNORE 1 LINES\r\n(manufacturer_id, manufacturer);<\/pre>\n<p>When using the <code>STARTING<\/code> <code>BY<\/code> subclause, be aware that your text file must use these prefixes consistently or you might get unexpected results. For example, the following text file, <strong><code>manufacturers6.txt<\/code><\/strong>, includes a line with two records but no prefix in front of the first record:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">manufacturer_id\tmanufacturer\r\n*,*101\t\"Airbus\"\r\n*,*102\t\"Beagle Aircraft Limited\"\r\n*,*103\t\"Beechcraft\"\r\n104\t\"Boeing\" *,*105\t\"Bombardier\"\r\n*,*106\t\"Cessna\"\r\n*,*107\t\"Embraer\"<\/pre>\n<p>After you\u2019ve created the file on your system, you can run the following <code>LOAD<\/code> <code>DATA<\/code> statement to see what happens:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">LOAD DATA LOCAL INFILE \r\n  '\/Users\/mac3\/Documents\/TravelData\/manufacturers6.txt' \r\nINTO TABLE manufacturers \r\nFIELDS ENCLOSED BY '\"'\r\nLINES STARTING BY '*,*'\r\nIGNORE 1 LINES\r\n(manufacturer_id, manufacturer);<\/pre>\n<p>When you execute this statement, MySQL skips the record with a <code>manufacturer_id<\/code> value of <code>104<\/code> but still adds the record with a value of <code>105<\/code>. You can verify this by again running your <code>SELECT<\/code> statement, which returns the results shown in the following figure.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1374\" height=\"364\" class=\"wp-image-96682\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2023\/04\/word-image-96677-5.png\" \/><\/p>\n<p>In some cases, you might run into text files whose lines are terminated with nontraditional characters (as opposed to the usual linefeeds or returns). For example, I created the <strong><code>manufacturers7.txt<\/code><\/strong> file using the following data, which separates the lines with triple hash marks (<code>###<\/code>):<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">manufacturer_id\tmanufacturer###101\t\"Airbus\"###102\t\"Beagle Aircraft Limited\"###103\t\"Beechcraft\"###104\t\"Boeing\"###105\t\"Bombardier\"###106\t\"Cessna\"###107\t\"Embraer\"<\/pre>\n<p>To accommodate this file, you need to include a <code>TERMINATED<\/code> <code>BY<\/code> subclause in your <code>LINES<\/code> clause that specifies the hashmarks:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">LOAD DATA LOCAL INFILE \r\n  '\/Users\/mac3\/Documents\/TravelData\/manufacturers7.txt' \r\nINTO TABLE manufacturers \r\nFIELDS ENCLOSED BY '\"'\r\nLINES TERMINATED BY '###'\r\nIGNORE 1 LINES\r\n(manufacturer_id, manufacturer);<\/pre>\n<p>When you run this statement, the database engine will know how to interpret the hashmarks and will insert the data accordingly, stripping out the hashmarks in the process.<\/p>\n<p>In some cases, you might also run into a text file that uses a character other than the backslash to escape characters within fields. For example, the <strong><code>manufacturers8.txt<\/code><\/strong> file contains seven lines of comma-delimited fields, one of which includes a comma in the manufacturer name:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">manufacturer_id,manufacturer\r\n101,Airbus\r\n102,Beagle Aircraft Limited\r\n103,Beechcraft\r\n104,Aviat Aircraft^, Inc.\r\n105,Bombardier\r\n106,Cessna\r\n107,Embraer<\/pre>\n<p>In this case, the name\u2019s comma is escaped with a caret (<code>^<\/code>). Because this character is not a backslash (the default escape character), you need to add an <code>ESCAPE<\/code> <code>BY<\/code> clause that specifies the caret, as shown in the following example:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">LOAD DATA LOCAL INFILE \r\n  '\/Users\/mac3\/Documents\/TravelData\/manufacturers8.txt' \r\nINTO TABLE manufacturers \r\nFIELDS TERMINATED BY ',' ESCAPED BY'^'\r\nIGNORE 1 LINES\r\n(manufacturer_id, manufacturer);<\/pre>\n<p>If you do not include the <code>ESCAPE<\/code> <code>BY<\/code> clause, the database engine will retain the caret but truncate the manufacturer name, as in <code>Aviat<\/code> <code>Aircraft^<\/code>. However, if you include the clause, MySQL will strip out the caret and treat the comma as a literal value, resulting in a column value of <code>Aviat<\/code> <code>Aircraft,<\/code> <code>Inc<\/code>., rather than the truncated version.<\/p>\n<h2>Getting started with importing data in MySQL<\/h2>\n<p>As mentioned earlier, the <code>LOAD<\/code> <code>DATA<\/code> statement includes other elements than what I\u2019ve shown you here. There are also other options for importing data, such as the <strong>mysqlimport<\/strong> command-line utility, which generates and sends <code>LOAD<\/code> <code>DATA<\/code> statements to the MySQL server. Most of the utility\u2019s options correlate directly to the <code>LOAD<\/code> <code>DATA<\/code> statement. Another option is the <strong>Table<\/strong> <strong>Data<\/strong> <strong>Import<\/strong> wizard in MySQL Workbench. The wizard walks you through the process of importing data from a CSV or JSON file.<\/p>\n<p>If you work with MySQL databases on a regular bases, chances are good that you\u2019ll want to import data from text files, even if only to set up test or development environments. In most cases, what I\u2019ve shown you here will be enough to get you started with the <code>LOAD<\/code> <code>DATA<\/code> statement. Just know that that you might run into situations that I haven\u2019t covered, in which case, it\u2019s always a good idea to review other MySQL documentation to help you fill in the gaps.<\/p>\n<h2>Appendix: Preparing your MySQL environment<\/h2>\n<p>When creating the examples for this article, I used a Mac computer that was set up with a local instance of MySQL 8.0.29 (Community Server edition). I also used MySQL Workbench to interface with MySQL. The examples import data from a set of sample text files that I created in Apple\u2019s text editor, TextEdit.<\/p>\n<p>I provide you with the files\u2019 contents throughout the article, along with the example <code>LOAD<\/code> <code>DATA<\/code> statements. If you plan to try out these examples, you can create the files on your own system as you work through those examples. Before you get started, however, you should run the following script against your MySQL instance:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">DROP DATABASE IF EXISTS travel;\r\n\r\nCREATE DATABASE travel;\r\nUSE travel;\r\nCREATE TABLE manufacturers (\r\n  manufacturer_id INT UNSIGNED NOT NULL,\r\n  manufacturer VARCHAR(50) NOT NULL,\r\n  create_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,\r\n  last_update TIMESTAMP NOT NULL \r\n    DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,\r\n  PRIMARY KEY (manufacturer_id) );<\/pre>\n<p>The script creates the <code>travel<\/code> database and adds the <code>manufacturers<\/code> table. Otherwise, that\u2019s all you need for the try out the examples (in addition to creating the source text files). For most of the examples, I simply truncated the data to prepare the table for the next example. If you already created the database and table for previous articles, I recommend that you re-create them now or at least truncate the <code>manufacturers<\/code> table before getting started.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Database and development teams often load data from plain text files into their MySQL databases. The files might be used to add lookup data, support test and development environments, populate new MySQL instances, load data from regular feeds, or in other ways support their operations. To help with the import process, MySQL provides the LOAD&#8230;&hellip;<\/p>\n","protected":false},"author":221841,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[53,145792],"tags":[],"coauthors":[6779],"class_list":["post-96677","post","type-post","status-publish","format-standard","hentry","category-featured","category-mysql"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/96677","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/221841"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=96677"}],"version-history":[{"count":7,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/96677\/revisions"}],"predecessor-version":[{"id":97332,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/96677\/revisions\/97332"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=96677"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=96677"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=96677"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=96677"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}