MySQL Reference Manual for version 3.23.51. - I Description of MySQL regular expression syntax
Go to the first, previous, next, last section, table of contents.
A regular expression (regex) is a powerful way of specifying a complex search.
MySQL uses Henry Spencer's implementation of regular
expressions, which is aimed at conformance with POSIX
1003.2. MySQL uses the extended version.
This is a simplistic reference that skips the details. To get more exact
information, see Henry Spencer's regex(7) manual page that is
included in the source distribution. See section E Credits.
A regular expression describes a set of strings. The simplest regexp is
one that has no special characters in it. For example, the regexp
hello matches hello and nothing else.
Non-trivial regular expressions use certain special constructs so that
they can match more than one string. For example, the regexp
hello|word matches either the string hello or the string
word.
As a more complex example, the regexp B[an]*s matches any of the
strings Bananas, Baaaaas, Bs, and any other string
starting with a B, ending with an s, and containing any
number of a or n characters in between.
A regular expression may use any of the following special
characters/constructs:
The is a more general way of writing regexps that match many
occurrences of the previous atom.
a*
Can be written as a{0,}.
a+
Can be written as a{1,}.
a?
Can be written as a{0,1}.
To be more precise, an atom followed by a bound containing one integer
i and no comma matches a sequence of exactly i matches of
the atom. An atom followed by a bound containing one integer i
and a comma matches a sequence of i or more matches of the atom.
An atom followed by a bound containing two integers i and
j matches a sequence of i through j (inclusive)
matches of the atom.
Both arguments must be in the range from 0 to RE_DUP_MAX
(default 255), inclusive. If there are two arguments, the second must be
greater than or equal to the first.
[a-dX]
[^a-dX]
Matches
any character which is (or is not, if ^ is used) either a, b,
c, d or X. To include a literal ] character,
it must immediately follow the opening bracket [. To include a
literal - character, it must be written first or last. So
[0-9] matches any decimal digit. Any character that does not have
a defined meaning inside a [] pair has no special meaning and
matches only itself.
The sequence of characters of that collating element. The sequence is a
single element of the bracket expression's list. A bracket expression
containing a multi-character collating element can thus match more than
one character, for example, if the collating sequence includes a ch
collating element, then the regular expression [[.ch.]]*c matches the
first five characters of chchcc.
[=character_class=]
An equivalence class, standing for the sequences of characters of all
collating elements equivalent to that one, including itself.
For example, if o and (+) are the members of an
equivalence class, then [[=o=]], [[=(+)=]], and
[o(+)] are all synonymous. An equivalence class may not be an
endpoint of a range.
[:character_class:]
Within a bracket expression, the name of a character class enclosed in
[: and :] stands for the list of all characters belonging
to that class. Standard character class names are:
alnum
digit
punct
alpha
graph
space
blank
lower
upper
cntrl
print
xdigit
These stand for the character classes defined in the ctype(3) manual
page. A locale may provide others. A character class may not be used as an
endpoint of a range.
These match the null string at the beginning and end of a word
respectively. A word is defined as a sequence of word characters which
is neither preceded nor followed by word characters. A word character is
an alnum character (as defined by ctype(3)) or an underscore
(_).
mysql> select "a word a" REGEXP "[[:<:]]word[[:>:]]"; -> 1
mysql> select "a xword a" REGEXP "[[:<:]]word[[:>:]]"; -> 0