GNU Info

Info Node: (mysql.info)EXPLAIN

(mysql.info)EXPLAIN


Next: Estimating performance Prev: Query Speed Up: Query Speed
Enter node , (file) or (file)node

`EXPLAIN' Syntax (Get Information About a `SELECT')
---------------------------------------------------

         EXPLAIN tbl_name
     or  EXPLAIN SELECT select_options

`EXPLAIN tbl_name' is a synonym for `DESCRIBE tbl_name' or `SHOW
COLUMNS FROM tbl_name'.

When you precede a `SELECT' statement with the keyword `EXPLAIN', MySQL
explains how it would process the `SELECT', providing information about
how tables are joined and in which order.

With the help of `EXPLAIN', you can see when you must add indexes to
tables to get a faster `SELECT' that uses indexes to find the records.
You can also see if the optimizer joins the tables in an optimal order.
To force the optimizer to use a specific join order for a `SELECT'
statement, add a `STRAIGHT_JOIN' clause.

For non-simple joins, `EXPLAIN' returns a row of information for each
table used in the `SELECT' statement. The tables are listed in the order
they would be read.  MySQL resolves all joins using a single-sweep
multi-join method. This means that MySQL reads a row from the first
table, then finds a matching row in the second table, then in the third
table and so on. When all tables are processed, it outputs the selected
columns and backtracks through the table list until a table is found
for which there are more matching rows. The next row is read from this
table and the process continues with the next table.

Output from `EXPLAIN' includes the following columns:

`table'
     The table to which the row of output refers.

`type'
     The join type.  Information about the various types is given below.

`possible_keys'
     The `possible_keys' column indicates which indexes MySQL could use
     to find the rows in this table. Note that this column is totally
     independent of the order of the tables. That means that some of
     the keys in possible_keys may not be usable in practice with the
     generated table order.

     If this column is empty, there are no relevant indexes. In this
     case, you may be able to improve the performance of your query by
     examining the `WHERE' clause to see if it refers to some column or
     columns that would be suitable for indexing.  If so, create an
     appropriate index and check the query with `EXPLAIN' again. Note:
     ALTER TABLE.

     To see what indexes a table has, use `SHOW INDEX FROM tbl_name'.

`key'
     The `key' column indicates the key that MySQL actually decided to
     use. The key is `NULL' if no index was chosen.  If MySQL chooses
     the wrong index, you can probably force MySQL to use another index
     by using `myisamchk --analyze', Note: myisamchk syntax, or by
     using `USE INDEX/IGNORE INDEX'.  Note: JOIN.

`key_len'
     The `key_len' column indicates the length of the key that MySQL
     decided to use.  The length is `NULL' if the `key' is `NULL'. Note
     that this tells us how many parts of a multi-part key MySQL will
     actually use.

`ref'
     The `ref' column shows which columns or constants are used with the
     `key' to select rows from the table.

`rows'
     The `rows' column indicates the number of rows MySQL believes it
     must examine to execute the query.

`Extra'
     This column contains additional information of how MySQL will
     resolve the query. Here is an explanation of the different text
     strings that can be found in this column:

    `Distinct'
          MySQL will not continue searching for more rows for the
          current row combination after it has found the first matching
          row.

    `Not exists'
          MySQL was able to do a `LEFT JOIN' optimization on the query
          and will not examine more rows in this table for the previous
          row combination after it finds one row that matches the `LEFT
          JOIN' criteria.

          Here is an example for this:

               SELECT * FROM t1 LEFT JOIN t2 ON t1.id=t2.id WHERE t2.id IS NULL;

          Assume that `t2.id' is defined with `NOT NULL'.  In this case
          MySQL will scan `t1' and look up the rows in `t2' through
          `t1.id'. If MySQL finds a matching row in `t2', it knows that
          `t2.id' can never be `NULL', and will not scan through the
          rest of the rows in `t2' that has the same `id'.  In other
          words, for each row in `t1', MySQL only needs to do a single
          lookup in `t2', independent of how many matching rows there
          are in `t2'.

    ``range checked for each record (index map: #)''
          MySQL didn't find a real good index to use. It will, instead,
          for each row combination in the preceding tables, do a check
          on which index to use (if any), and use this index to
          retrieve the rows from the table.  This isn't very fast but
          is faster than having to do a join without an index.

    `Using filesort'
          MySQL will need to do an extra pass to find out how to
          retrieve the rows in sorted order.  The sort is done by going
          through all rows according to the `join type' and storing the
          sort key + pointer to the row for all rows that match the
          `WHERE'. Then the keys are sorted. Finally the rows are
          retrieved in sorted order.

    `Using index'
          The column information is retrieved from the table using only
          information in the index tree without having to do an
          additional seek to read the actual row.  This can be done
          when all the used columns for the table are part of the same
          index.

    `Using temporary'
          To resolve the query MySQL will need to create a temporary
          table to hold the result.  This typically happens if you do an
          `ORDER BY' on a different column set than you did a `GROUP
          BY' on.

    `Where used'
          A `WHERE' clause will be used to restrict which rows will be
          matched against the next table or sent to the client.  If you
          don't have this information and the table is of type `ALL' or
          `index', you may have something wrong in your query (if you
          don't intend to fetch/examine all rows from the table).

     If you want to get your queries as fast as possible, you should
     look out for `Using filesort' and `Using temporary'.

The different join types are listed below, ordered from best to worst
type:

`system'
     The table has only one row (= system table). This is a special
     case of the `const' join type.

`const'
     The table has at most one matching row, which will be read at the
     start of the query. Because there is only one row, values from the
     column in this row can be regarded as constants by the rest of the
     optimizer. `const' tables are very fast as they are read only once!

`eq_ref'
     One row will be read from this table for each combination of rows
     from the previous tables.  This is the best possible join type,
     other than the `const' types.  It is used when all parts of an
     index are used by the join and the index is `UNIQUE' or a `PRIMARY
     KEY'.

`ref'
     All rows with matching index values will be read from this table
     for each combination of rows from the previous tables.  `ref' is
     used if the join uses only a leftmost prefix of the key, or if the
     key is not `UNIQUE' or a `PRIMARY KEY' (in other words, if the
     join cannot select a single row based on the key value).  If the
     key that is used matches only a few rows, this join type is good.

`range'
     Only rows that are in a given range will be retrieved, using an
     index to select the rows.  The `key' column indicates which index
     is used.  The `key_len' contains the longest key part that was
     used.  The `ref' column will be NULL for this type.

`index'
     This is the same as `ALL', except that only the index tree is
     scanned.  This is usually faster than `ALL', as the index file is
     usually smaller than the data file.

`ALL'
     A full table scan will be done for each combination of rows from
     the previous tables.  This is normally not good if the table is
     the first table not marked `const', and usually *very* bad in all
     other cases. You normally can avoid `ALL' by adding more indexes,
     so that the row can be retrieved based on constant values or
     column values from earlier tables.

You can get a good indication of how good a join is by multiplying all
values in the `rows' column of the `EXPLAIN' output. This should tell
you roughly how many rows MySQL must examine to execute the query. This
number is also used when you restrict queries with the `max_join_size'
variable.  Note: Server parameters.

The following example shows how a `JOIN' can be optimized progressively
using the information provided by `EXPLAIN'.

Suppose you have the `SELECT' statement shown below, that you examine
using `EXPLAIN':

     EXPLAIN SELECT tt.TicketNumber, tt.TimeIn,
                 tt.ProjectReference, tt.EstimatedShipDate,
                 tt.ActualShipDate, tt.ClientID,
                 tt.ServiceCodes, tt.RepetitiveID,
                 tt.CurrentProcess, tt.CurrentDPPerson,
                 tt.RecordVolume, tt.DPPrinted, et.COUNTRY,
                 et_1.COUNTRY, do.CUSTNAME
             FROM tt, et, et AS et_1, do
             WHERE tt.SubmitTime IS NULL
                 AND tt.ActualPC = et.EMPLOYID
                 AND tt.AssignedPC = et_1.EMPLOYID
                 AND tt.ClientID = do.CUSTNMBR;

For this example, assume that:

   * The columns being compared have been declared as follows:

     *Table* *Column*      *Column type*
     `tt'    `ActualPC'    `CHAR(10)'
     `tt'    `AssignedPC'  `CHAR(10)'
     `tt'    `ClientID'    `CHAR(10)'
     `et'    `EMPLOYID'    `CHAR(15)'
     `do'    `CUSTNMBR'    `CHAR(15)'

   * The tables have the indexes shown below:

     *Table* *Index*
     `tt'    `ActualPC'
     `tt'    `AssignedPC'
     `tt'    `ClientID'
     `et'    `EMPLOYID' (primary key)
     `do'    `CUSTNMBR' (primary key)

   * The `tt.ActualPC' values aren't evenly distributed.

Initially, before any optimizations have been performed, the `EXPLAIN'
statement produces the following information:

     table type possible_keys                key  key_len ref  rows  Extra
     et    ALL  PRIMARY                      NULL NULL    NULL 74
     do    ALL  PRIMARY                      NULL NULL    NULL 2135
     et_1  ALL  PRIMARY                      NULL NULL    NULL 74
     tt    ALL  AssignedPC,ClientID,ActualPC NULL NULL    NULL 3872
           range checked for each record (key map: 35)

Because `type' is `ALL' for each table, this output indicates that
MySQL is doing a full join for all tables!  This will take quite a long
time, as the product of the number of rows in each table must be
examined!  For the case at hand, this is `74 * 2135 * 74 * 3872 =
45,268,558,720' rows.  If the tables were bigger, you can only imagine
how long it would take.

One problem here is that MySQL can't (yet) use indexes on columns
efficiently if they are declared differently.  In this context,
`VARCHAR' and `CHAR' are the same unless they are declared as different
lengths. Because `tt.ActualPC' is declared as `CHAR(10)' and
`et.EMPLOYID' is declared as `CHAR(15)', there is a length mismatch.

To fix this disparity between column lengths, use `ALTER TABLE' to
lengthen `ActualPC' from 10 characters to 15 characters:

     mysql> ALTER TABLE tt MODIFY ActualPC VARCHAR(15);

Now `tt.ActualPC' and `et.EMPLOYID' are both `VARCHAR(15)'.  Executing
the `EXPLAIN' statement again produces this result:

     table type   possible_keys   key     key_len ref         rows    Extra
     tt    ALL    AssignedPC,ClientID,ActualPC NULL NULL NULL 3872    where used
     do    ALL    PRIMARY         NULL    NULL    NULL        2135
           range checked for each record (key map: 1)
     et_1  ALL    PRIMARY         NULL    NULL    NULL        74
           range checked for each record (key map: 1)
     et    eq_ref PRIMARY         PRIMARY 15      tt.ActualPC 1

This is not perfect, but is much better (the product of the `rows'
values is now less by a factor of 74). This version is executed in a
couple of seconds.

A second alteration can be made to eliminate the column length
mismatches for the `tt.AssignedPC = et_1.EMPLOYID' and `tt.ClientID =
do.CUSTNMBR' comparisons:

     mysql> ALTER TABLE tt MODIFY AssignedPC VARCHAR(15),
                           MODIFY ClientID   VARCHAR(15);

Now `EXPLAIN' produces the output shown below:

     table type   possible_keys   key     key_len ref            rows     Extra
     et    ALL    PRIMARY         NULL    NULL    NULL           74
     tt    ref    AssignedPC,ClientID,ActualPC ActualPC 15 et.EMPLOYID 52 where used
     et_1  eq_ref PRIMARY         PRIMARY 15      tt.AssignedPC  1
     do    eq_ref PRIMARY         PRIMARY 15      tt.ClientID    1

This is almost as good as it can get.

The remaining problem is that, by default, MySQL assumes that values in
the `tt.ActualPC' column are evenly distributed, and that isn't the
case for the `tt' table.  Fortunately, it is easy to tell MySQL about
this:

     shell> myisamchk --analyze PATH_TO_MYSQL_DATABASE/tt
     shell> mysqladmin refresh

Now the join is perfect, and `EXPLAIN' produces this result:

     table type   possible_keys   key     key_len ref            rows    Extra
     tt    ALL    AssignedPC,ClientID,ActualPC NULL NULL NULL    3872    where used
     et    eq_ref PRIMARY         PRIMARY 15      tt.ActualPC    1
     et_1  eq_ref PRIMARY         PRIMARY 15      tt.AssignedPC  1
     do    eq_ref PRIMARY         PRIMARY 15      tt.ClientID    1

Note that the `rows' column in the output from `EXPLAIN' is an educated
guess from the MySQL join optimizer. To optimize a query, you should
check if the numbers are even close to the truth.  If not, you may get
better performance by using `STRAIGHT_JOIN' in your `SELECT' statement
and trying to list the tables in a different order in the `FROM' clause.


automatically generated by info2www version 1.2.2.9