GNU Info

Info Node: (mysql.info)LOAD DATA

(mysql.info)LOAD DATA


Next: DO Prev: REPLACE Up: Data Manipulation
Enter node , (file) or (file)node

`LOAD DATA INFILE' Syntax
-------------------------

     LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name.txt'
         [REPLACE | IGNORE]
         INTO TABLE tbl_name
         [FIELDS
             [TERMINATED BY '\t']
             [[OPTIONALLY] ENCLOSED BY '']
             [ESCAPED BY '\\' ]
         ]
         [LINES TERMINATED BY '\n']
         [IGNORE number LINES]
         [(col_name,...)]

The `LOAD DATA INFILE' statement reads rows from a text file into a
table at a very high speed.  If the `LOCAL' keyword is specified, the
file is read from the client host.  If `LOCAL' is not specified, the
file must be located on the server.  (`LOCAL' is available in MySQL
Version 3.22.6 or later.)

For security reasons, when reading text files located on the server, the
files must either reside in the database directory or be readable by
all.  Also, to use `LOAD DATA INFILE' on server files, you must have the
*file* privilege on the server host.  Note: Privileges provided.

If you specify the keyword `LOW_PRIORITY', execution of the `LOAD DATA'
statement is delayed until no other clients are reading from the table.

If you specify the keyword `CONCURRENT' with a `MyISAM' table, then
other threads can retrieve data from the table while `LOAD DATA' is
executing. Using this option will of course affect the performance of
`LOAD DATA' a bit even if no other thread is using the table at the
same time.

Using `LOCAL' will be a bit slower than letting the server access the
files directly, because the contents of the file must travel from the
client host to the server host.  On the other hand, you do not need the
*file* privilege to load local files.

If you are using MySQL before Version 3.23.24 you can't read from a
FIFO with `LOAD DATA INFILE'.  If you need to read from a FIFO (for
example the output from gunzip), use `LOAD DATA LOCAL INFILE' instead.

You can also load data files by using the `mysqlimport' utility; it
operates by sending a `LOAD DATA INFILE' command to the server.  The
`--local' option causes `mysqlimport' to read data files from the
client host.  You can specify the `--compress' option to get better
performance over slow networks if the client and server support the
compressed protocol.

When locating files on the server host, the server uses the following
rules:

   * If an absolute pathname is given, the server uses the pathname as
     is.

   * If a relative pathname with one or more leading components is
     given, the server searches for the file relative to the server's
     data directory.

   * If a filename with no leading components is given, the server
     looks for the file in the database directory of the current
     database.

Note that these rules mean a file given as `./myfile.txt' is read from
the server's data directory, whereas a file given as `myfile.txt' is
read from the database directory of the current database.  For example,
the following `LOAD DATA' statement reads the file `data.txt' from the
database directory for `db1' because `db1' is the current database,
even though the statement explicitly loads the file into a table in the
`db2' database:

     mysql> USE db1;
     mysql> LOAD DATA INFILE "data.txt" INTO TABLE db2.my_table;

The `REPLACE' and `IGNORE' keywords control handling of input records
that duplicate existing records on unique key values.  If you specify
`REPLACE', new rows replace existing rows that have the same unique key
value. If you specify `IGNORE', input rows that duplicate an existing
row on a unique key value are skipped.  If you don't specify either
option, an error occurs when a duplicate key value is found, and the
rest of the text file is ignored.

If you load data from a local file using the `LOCAL' keyword, the server
has no way to stop transmission of the file in the middle of the
operation, so the default bahavior is the same as if `IGNORE' is
specified.

If you use `LOAD DATA INFILE' on an empty `MyISAM' table, all
non-unique indexes are created in a separate batch (like in `REPAIR').
This normally makes `LOAD DATA INFILE' much faster when you have many
indexes.

`LOAD DATA INFILE' is the complement of `SELECT ... INTO OUTFILE'.
Note: `SELECT'.  To write data from a database to a file, use
`SELECT ... INTO OUTFILE'.  To read the file back into the database,
use `LOAD DATA INFILE'.  The syntax of the `FIELDS' and `LINES' clauses
is the same for both commands.  Both clauses are optional, but `FIELDS'
must precede `LINES' if both are specified.

If you specify a `FIELDS' clause, each of its subclauses (`TERMINATED
BY', `[OPTIONALLY] ENCLOSED BY', and `ESCAPED BY') is also optional,
except that you must specify at least one of them.

If you don't specify a `FIELDS' clause, the defaults are the same as if
you had written this:

     FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'

If you don't specify a `LINES' clause, the default is the same as if
you had written this:

     LINES TERMINATED BY '\n'

In other words, the defaults cause `LOAD DATA INFILE' to act as follows
when reading input:

   * Look for line boundaries at newlines.

   * Break lines into fields at tabs.

   * Do not expect fields to be enclosed within any quoting characters.

   * Interpret occurrences of tab, newline, or `\' preceded by `\' as
     literal characters that are part of field values.

Conversely, the defaults cause `SELECT ... INTO OUTFILE' to act as
follows when writing output:

   * Write tabs between fields.

   * Do not enclose fields within any quoting characters.

   * Use `\' to escape instances of tab, newline or `\' that occur
     within field values.

   * Write newlines at the ends of lines.

Note that to write `FIELDS ESCAPED BY '\\'', you must specify two
backslashes for the value to be read as a single backslash.

The `IGNORE number LINES' option can be used to ignore a header of
column names at the start of the file:

     mysql> LOAD DATA INFILE "/tmp/file_name" into table test IGNORE 1 LINES;

When you use `SELECT ... INTO OUTFILE' in tandem with `LOAD DATA
INFILE' to write data from a database into a file and then read the
file back into the database later, the field and line handling options
for both commands must match.  Otherwise, `LOAD DATA INFILE' will not
interpret the contents of the file properly.  Suppose you use `SELECT
... INTO OUTFILE' to write a file with fields delimited by commas:

     mysql> SELECT * INTO OUTFILE 'data.txt'
                FIELDS TERMINATED BY ','
                FROM ...;

To read the comma-delimited file back in, the correct statement would
be:

     mysql> LOAD DATA INFILE 'data.txt' INTO TABLE table2
                FIELDS TERMINATED BY ',';

If instead you tried to read in the file with the statement shown
below, it wouldn't work because it instructs `LOAD DATA INFILE' to look
for tabs between fields:

     mysql> LOAD DATA INFILE 'data.txt' INTO TABLE table2
                FIELDS TERMINATED BY '\t';

The likely result is that each input line would be interpreted as a
single field.

`LOAD DATA INFILE' can be used to read files obtained from external
sources, too. For example, a file in dBASE format will have fields
separated by commas and enclosed in double quotes.  If lines in the
file are terminated by newlines, the command shown below illustrates
the field and line handling options you would use to load the file:

     mysql> LOAD DATA INFILE 'data.txt' INTO TABLE tbl_name
                FIELDS TERMINATED BY ',' ENCLOSED BY '"'
                LINES TERMINATED BY '\n';

Any of the field or line handling options may specify an empty string
(`''').  If not empty, the `FIELDS [OPTIONALLY] ENCLOSED BY' and
`FIELDS ESCAPED BY' values must be a single character.  The `FIELDS
TERMINATED BY' and `LINES TERMINATED BY' values may be more than one
character.  For example, to write lines that are terminated by carriage
return-linefeed pairs, or to read a file containing such lines, specify
a `LINES TERMINATED BY '\r\n'' clause.

For example, to read a file of jokes, that are separated with a line of
`%%', into a SQL table you can do:

     create table jokes (a int not null auto_increment primary key, joke text
     not null);
     load data infile "/tmp/jokes.txt" into table jokes fields terminated by ""
     lines terminated by "\n%%\n" (joke);

`FIELDS [OPTIONALLY] ENCLOSED BY' controls quoting of fields.  For
output (`SELECT ... INTO OUTFILE'), if you omit the word `OPTIONALLY',
all fields are enclosed by the `ENCLOSED BY' character.  An example of
such output (using a comma as the field delimiter) is shown below:

     "1","a string","100.20"
     "2","a string containing a , comma","102.20"
     "3","a string containing a \" quote","102.20"
     "4","a string containing a \", quote and comma","102.20"

If you specify `OPTIONALLY', the `ENCLOSED BY' character is used only
to enclose `CHAR' and `VARCHAR' fields:

     1,"a string",100.20
     2,"a string containing a , comma",102.20
     3,"a string containing a \" quote",102.20
     4,"a string containing a \", quote and comma",102.20

Note that occurrences of the `ENCLOSED BY' character within a field
value are escaped by prefixing them with the `ESCAPED BY' character.
Also note that if you specify an empty `ESCAPED BY' value, it is
possible to generate output that cannot be read properly by `LOAD DATA
INFILE'. For example, the output just shown above would appear as shown
below if the escape character is empty.  Observe that the second field
in the fourth line contains a comma following the quote, which
(erroneously) appears to terminate the field:

     1,"a string",100.20
     2,"a string containing a , comma",102.20
     3,"a string containing a " quote",102.20
     4,"a string containing a ", quote and comma",102.20

For input, the `ENCLOSED BY' character, if present, is stripped from the
ends of field values.  (This is true whether or not `OPTIONALLY' is
specified; `OPTIONALLY' has no effect on input interpretation.)
Occurrences of the `ENCLOSED BY' character preceded by the `ESCAPED BY'
character are interpreted as part of the current field value.  In
addition, duplicated `ENCLOSED BY' characters occurring within fields
are interpreted as single `ENCLOSED BY' characters if the field itself
starts with that character.  For example, if `ENCLOSED BY '"'' is
specified, quotes are handled as shown below:

     "The ""BIG"" boss"  -> The "BIG" boss
     The "BIG" boss      -> The "BIG" boss
     The ""BIG"" boss    -> The ""BIG"" boss

`FIELDS ESCAPED BY' controls how to write or read special characters.
If the `FIELDS ESCAPED BY' character is not empty, it is used to prefix
the following characters on output:
   * The `FIELDS ESCAPED BY' character

   * The `FIELDS [OPTIONALLY] ENCLOSED BY' character

   * The first character of the `FIELDS TERMINATED BY' and `LINES
     TERMINATED BY' values

   * ASCII `0' (what is actually written following the escape character
     is ASCII `'0'', not a zero-valued byte)

If the `FIELDS ESCAPED BY' character is empty, no characters are
escaped.  It is probably not a good idea to specify an empty escape
character, particularly if field values in your data contain any of the
characters in the list just given.

For input, if the `FIELDS ESCAPED BY' character is not empty,
occurrences of that character are stripped and the following character
is taken literally as part of a field value.  The exceptions are an
escaped `0' or `N' (for example, `\0' or `\N' if the escape character is
`\').  These sequences are interpreted as ASCII `0' (a zero-valued
byte) and `NULL'.  See below for the rules on `NULL' handling.

For more information about `\'-escape syntax, see Note: Literals.

In certain cases, field and line handling options interact:

   * If `LINES TERMINATED BY' is an empty string and `FIELDS TERMINATED
     BY' is non-empty, lines are also terminated with `FIELDS
     TERMINATED BY'.

   * If the `FIELDS TERMINATED BY' and `FIELDS ENCLOSED BY' values are
     both empty (`'''), a fixed-row (non-delimited) format is used.
     With fixed-row format, no delimiters are used between fields.
     Instead, column values are written and read using the "display"
     widths of the columns.  For example, if a column is declared as
     `INT(7)', values for the column are written using 7-character
     fields.  On input, values for the column are obtained by reading 7
     characters.  Fixed-row format also affects handling of `NULL'
     values; see below.  Note that fixed-size format will not work if
     you are using a multi-byte character set.

Handling of `NULL' values varies, depending on the `FIELDS' and `LINES'
options you use:

   * For the default `FIELDS' and `LINES' values, `NULL' is written as
     `\N' for output and `\N' is read as `NULL' for input (assuming the
     `ESCAPED BY' character is `\').

   * If `FIELDS ENCLOSED BY' is not empty, a field containing the
     literal word `NULL' as its value is read as a `NULL' value (this
     differs from the word `NULL' enclosed within `FIELDS ENCLOSED BY'
     characters, which is read as the string `'NULL'').

   * If `FIELDS ESCAPED BY' is empty, `NULL' is written as the word
     `NULL'.

   * With fixed-row format (which happens when `FIELDS TERMINATED BY'
     and `FIELDS ENCLOSED BY' are both empty), `NULL' is written as an
     empty string.  Note that this causes both `NULL' values and empty
     strings in the table to be indistinguishable when written to the
     file because they are both written as empty strings.  If you need
     to be able to tell the two apart when reading the file back in,
     you should not use fixed-row format.

Some cases are not supported by `LOAD DATA INFILE':
   * Fixed-size rows (`FIELDS TERMINATED BY' and `FIELDS ENCLOSED BY'
     both empty) and `BLOB' or `TEXT' columns.

   * If you specify one separator that is the same as or a prefix of
     another, `LOAD DATA INFILE' won't be able to interpret the input
     properly.  For example, the following `FIELDS' clause would cause
     problems:

          FIELDS TERMINATED BY '"' ENCLOSED BY '"'

   * If `FIELDS ESCAPED BY' is empty, a field value that contains an
     occurrence of `FIELDS ENCLOSED BY' or `LINES TERMINATED BY'
     followed by the `FIELDS TERMINATED BY' value will cause `LOAD DATA
     INFILE' to stop reading a field or line too early.  This happens
     because `LOAD DATA INFILE' cannot properly determine where the
     field or line value ends.

The following example loads all columns of the `persondata' table:

     mysql> LOAD DATA INFILE 'persondata.txt' INTO TABLE persondata;

No field list is specified, so `LOAD DATA INFILE' expects input rows to
contain a field for each table column.  The default `FIELDS' and
`LINES' values are used.

If you wish to load only some of a table's columns, specify a field
list:

     mysql> LOAD DATA INFILE 'persondata.txt'
                INTO TABLE persondata (col1,col2,...);

You must also specify a field list if the order of the fields in the
input file differs from the order of the columns in the table.
Otherwise, MySQL cannot tell how to match up input fields with table
columns.

If a row has too few fields, the columns for which no input field is
present are set to default values.  Default value assignment is
described in Note: `CREATE TABLE'.

An empty field value is interpreted differently than if the field value
is missing:

   * For string types, the column is set to the empty string.

   * For numeric types, the column is set to `0'.

   * For date and time types, the column is set to the appropriate
     "zero" value for the type.  Note: Date and time types.

Note that these are the same values that result if you assign an empty
string explicitly to a string, numeric, or date or time type explicitly
in an `INSERT' or `UPDATE' statement.

`TIMESTAMP' columns are only set to the current date and time if there
is a `NULL' value for the column, or (for the first `TIMESTAMP' column
only) if the `TIMESTAMP' column is left out from the field list when a
field list is specified.

If an input row has too many fields, the extra fields are ignored and
the number of warnings is incremented.

`LOAD DATA INFILE' regards all input as strings, so you can't use
numeric values for `ENUM' or `SET' columns the way you can with
`INSERT' statements.  All `ENUM' and `SET' values must be specified as
strings!

If you are using the C API, you can get information about the query by
calling the API function `mysql_info()' when the `LOAD DATA INFILE'
query finishes.  The format of the information string is shown below:

     Records: 1  Deleted: 0  Skipped: 0  Warnings: 0

Warnings occur under the same circumstances as when values are inserted
via the `INSERT' statement (Note: `INSERT'.), except that `LOAD
DATA INFILE' also generates warnings when there are too few or too many
fields in the input row.  The warnings are not stored anywhere; the
number of warnings can only be used as an indication if everything went
well.  If you get warnings and want to know exactly why you got them,
one way to do this is to use `SELECT ... INTO OUTFILE' into another
file and compare this to your original input file.

If you need `LOAD DATA' to read from a pipe, you can use the following
trick:

     mkfifo /mysql/db/x/x
     chmod 666 /mysql/db/x/x
     cat < /dev/tcp/10.1.1.12/4711 > /nt/mysql/db/x/x
     mysql -e "LOAD DATA INFILE 'x' INTO TABLE x" x

If you are using a version of MySQL older than 3.23.25 you can only do
the above with `LOAD DATA LOCAL INFILE'.

For more information about the efficiency of `INSERT' versus `LOAD DATA
INFILE' and speeding up `LOAD DATA INFILE', Note: Insert speed.


automatically generated by info2www version 1.2.2.9