GNU Info

Info Node: (mysql.info)Regexp

(mysql.info)Regexp


Next: Unireg Prev: Environment variables Up: Top
Enter node , (file) or (file)node

Description of MySQL regular expression syntax
**********************************************

A regular expression (regex) is a powerful way of specifying a complex
search.

MySQL uses Henry Spencer's implementation of regular expressions, which
is aimed at conformance with POSIX 1003.2. MySQL uses the extended
version.

This is a simplistic reference that skips the details. To get more exact
information, see Henry Spencer's `regex(7)' manual page that is
included in the source distribution. Note: Credits.

A regular expression describes a set of strings. The simplest regexp is
one that has no special characters in it. For example, the regexp
`hello' matches `hello' and nothing else.

Non-trivial regular expressions use certain special constructs so that
they can match more than one string. For example, the regexp
`hello|word' matches either the string `hello' or the string `word'.

As a more complex example, the regexp `B[an]*s' matches any of the
strings `Bananas', `Baaaaas', `Bs', and any other string starting with
a `B', ending with an `s', and containing any number of `a' or `n'
characters in between.

A regular expression may use any of the following special
characters/constructs:
`^'
     Match the beginning of a string.
          mysql> select "fo\nfo" REGEXP "^fo$";           -> 0
          mysql> select "fofo" REGEXP "^fo";              -> 1

`$'
     Match the end of a string.
          mysql> select "fo\no" REGEXP "^fo\no$";         -> 1
          mysql> select "fo\no" REGEXP "^fo$";            -> 0

`.'
     Match any character (including newline).
          mysql> select "fofo" REGEXP "^f.*";             -> 1
          mysql> select "fo\nfo" REGEXP "^f.*";           -> 1

`a*'
     Match any sequence of zero or more `a' characters.
          mysql> select "Ban" REGEXP "^Ba*n";             -> 1
          mysql> select "Baaan" REGEXP "^Ba*n";           -> 1
          mysql> select "Bn" REGEXP "^Ba*n";              -> 1

`a+'
     Match any sequence of one or more `a' characters.
          mysql> select "Ban" REGEXP "^Ba+n";             -> 1
          mysql> select "Bn" REGEXP "^Ba+n";              -> 0

`a?'
     Match either zero or one `a' character.
          mysql> select "Bn" REGEXP "^Ba?n";              -> 1
          mysql> select "Ban" REGEXP "^Ba?n";             -> 1
          mysql> select "Baan" REGEXP "^Ba?n";            -> 0

`de|abc'
     Match either of the sequences `de' or `abc'.
          mysql> select "pi" REGEXP "pi|apa";             -> 1
          mysql> select "axe" REGEXP "pi|apa";            -> 0
          mysql> select "apa" REGEXP "pi|apa";            -> 1
          mysql> select "apa" REGEXP "^(pi|apa)$";        -> 1
          mysql> select "pi" REGEXP "^(pi|apa)$";         -> 1
          mysql> select "pix" REGEXP "^(pi|apa)$";        -> 0

`(abc)*'
     Match zero or more instances of the sequence `abc'.
          mysql> select "pi" REGEXP "^(pi)*$";            -> 1
          mysql> select "pip" REGEXP "^(pi)*$";           -> 0
          mysql> select "pipi" REGEXP "^(pi)*$";          -> 1

`{1}'
`{2,3}'
     The is a more general way of writing regexps that match many
     occurrences of the previous atom.
    `a*'
          Can be written as `a{0,}'.

    `a+'
          Can be written as `a{1,}'.

    `a?'
          Can be written as `a{0,1}'.  To be more precise, an atom
     followed by a bound containing one integer `i' and no comma
     matches a sequence of exactly `i' matches of the atom. An atom
     followed by a bound containing one integer `i' and a comma matches
     a sequence of `i' or more matches of the atom.  An atom followed
     by a bound containing two integers `i' and `j' matches a sequence
     of `i' through `j' (inclusive) matches of the atom.

     Both arguments must be in the range from `0' to `RE_DUP_MAX'
     (default 255), inclusive.  If there are two arguments, the second
     must be greater than or equal to the first.

`[a-dX]'
`[^a-dX]'
     Matches any character which is (or is not, if ^ is used) either
     `a', `b', `c', `d' or `X'. To include a literal `]' character, it
     must immediately follow the opening bracket `['.  To include a
     literal `-' character, it must be written first or last. So
     `[0-9]' matches any decimal digit. Any character that does not have
     a defined meaning inside a `[]' pair has no special meaning and
     matches only itself.
          mysql> select "aXbc" REGEXP "[a-dXYZ]";         -> 1
          mysql> select "aXbc" REGEXP "^[a-dXYZ]$";       -> 0
          mysql> select "aXbc" REGEXP "^[a-dXYZ]+$";      -> 1
          mysql> select "aXbc" REGEXP "^[^a-dXYZ]+$";     -> 0
          mysql> select "gheis" REGEXP "^[^a-dXYZ]+$";    -> 1
          mysql> select "gheisa" REGEXP "^[^a-dXYZ]+$";   -> 0

`[[.characters.]]'
     The sequence of characters of that collating element. The sequence
     is a single element of the bracket expression's list. A bracket
     expression containing a multi-character collating element can thus
     match more than one character, for example, if the collating
     sequence includes a `ch' collating element, then the regular
     expression `[[.ch.]]*c' matches the first five characters of
     `chchcc'.

`[=character_class=]'
     An equivalence class, standing for the sequences of characters of
     all collating elements equivalent to that one, including itself.

     For example, if `o' and `(+)' are the members of an equivalence
     class, then `[[=o=]]', `[[=(+)=]]', and `[o(+)]' are all
     synonymous. An equivalence class may not be an endpoint of a range.

`[:character_class:]'
     Within a bracket expression, the name of a character class
     enclosed in `[:' and `:]' stands for the list of all characters
     belonging to that class. Standard character class names are:

     alnum                  digit                  punct
     alpha                  graph                  space
     blank                  lower                  upper
     cntrl                  print                  xdigit

     These stand for the character classes defined in the `ctype(3)'
     manual page. A locale may provide others. A character class may
     not be used as an endpoint of a range.
          mysql> select "justalnums" REGEXP "[[:alnum:]]+";       -> 1
          mysql> select "!!" REGEXP "[[:alnum:]]+";               -> 0

`[[:<:]]'
`[[:>:]]'
     These match the null string at the beginning and end of a word
     respectively.  A word is defined as a sequence of word characters
     which is neither preceded nor followed by word characters. A word
     character is an alnum character (as defined by `ctype(3)') or an
     underscore (`_').
          mysql> select "a word a" REGEXP "[[:<:]]word[[:>:]]";      -> 1
          mysql> select "a xword a" REGEXP "[[:<:]]word[[:>:]]";     -> 0

     mysql> select "weeknights" REGEXP "^(wee|week)(knights|nights)$"; -> 1


automatically generated by info2www version 1.2.2.9