Copyright (C) 2000-2012 |
GNU Info (mysql.info)RegexpDescription of MySQL regular expression syntax ********************************************** A regular expression (regex) is a powerful way of specifying a complex search. MySQL uses Henry Spencer's implementation of regular expressions, which is aimed at conformance with POSIX 1003.2. MySQL uses the extended version. This is a simplistic reference that skips the details. To get more exact information, see Henry Spencer's `regex(7)' manual page that is included in the source distribution. Note: Credits. A regular expression describes a set of strings. The simplest regexp is one that has no special characters in it. For example, the regexp `hello' matches `hello' and nothing else. Non-trivial regular expressions use certain special constructs so that they can match more than one string. For example, the regexp `hello|word' matches either the string `hello' or the string `word'. As a more complex example, the regexp `B[an]*s' matches any of the strings `Bananas', `Baaaaas', `Bs', and any other string starting with a `B', ending with an `s', and containing any number of `a' or `n' characters in between. A regular expression may use any of the following special characters/constructs: `^' Match the beginning of a string. mysql> select "fo\nfo" REGEXP "^fo$"; -> 0 mysql> select "fofo" REGEXP "^fo"; -> 1 `$' Match the end of a string. mysql> select "fo\no" REGEXP "^fo\no$"; -> 1 mysql> select "fo\no" REGEXP "^fo$"; -> 0 `.' Match any character (including newline). mysql> select "fofo" REGEXP "^f.*"; -> 1 mysql> select "fo\nfo" REGEXP "^f.*"; -> 1 `a*' Match any sequence of zero or more `a' characters. mysql> select "Ban" REGEXP "^Ba*n"; -> 1 mysql> select "Baaan" REGEXP "^Ba*n"; -> 1 mysql> select "Bn" REGEXP "^Ba*n"; -> 1 `a+' Match any sequence of one or more `a' characters. mysql> select "Ban" REGEXP "^Ba+n"; -> 1 mysql> select "Bn" REGEXP "^Ba+n"; -> 0 `a?' Match either zero or one `a' character. mysql> select "Bn" REGEXP "^Ba?n"; -> 1 mysql> select "Ban" REGEXP "^Ba?n"; -> 1 mysql> select "Baan" REGEXP "^Ba?n"; -> 0 `de|abc' Match either of the sequences `de' or `abc'. mysql> select "pi" REGEXP "pi|apa"; -> 1 mysql> select "axe" REGEXP "pi|apa"; -> 0 mysql> select "apa" REGEXP "pi|apa"; -> 1 mysql> select "apa" REGEXP "^(pi|apa)$"; -> 1 mysql> select "pi" REGEXP "^(pi|apa)$"; -> 1 mysql> select "pix" REGEXP "^(pi|apa)$"; -> 0 `(abc)*' Match zero or more instances of the sequence `abc'. mysql> select "pi" REGEXP "^(pi)*$"; -> 1 mysql> select "pip" REGEXP "^(pi)*$"; -> 0 mysql> select "pipi" REGEXP "^(pi)*$"; -> 1 `{1}' `{2,3}' The is a more general way of writing regexps that match many occurrences of the previous atom. `a*' Can be written as `a{0,}'. `a+' Can be written as `a{1,}'. `a?' Can be written as `a{0,1}'. To be more precise, an atom followed by a bound containing one integer `i' and no comma matches a sequence of exactly `i' matches of the atom. An atom followed by a bound containing one integer `i' and a comma matches a sequence of `i' or more matches of the atom. An atom followed by a bound containing two integers `i' and `j' matches a sequence of `i' through `j' (inclusive) matches of the atom. Both arguments must be in the range from `0' to `RE_DUP_MAX' (default 255), inclusive. If there are two arguments, the second must be greater than or equal to the first. `[a-dX]' `[^a-dX]' Matches any character which is (or is not, if ^ is used) either `a', `b', `c', `d' or `X'. To include a literal `]' character, it must immediately follow the opening bracket `['. To include a literal `-' character, it must be written first or last. So `[0-9]' matches any decimal digit. Any character that does not have a defined meaning inside a `[]' pair has no special meaning and matches only itself. mysql> select "aXbc" REGEXP "[a-dXYZ]"; -> 1 mysql> select "aXbc" REGEXP "^[a-dXYZ]$"; -> 0 mysql> select "aXbc" REGEXP "^[a-dXYZ]+$"; -> 1 mysql> select "aXbc" REGEXP "^[^a-dXYZ]+$"; -> 0 mysql> select "gheis" REGEXP "^[^a-dXYZ]+$"; -> 1 mysql> select "gheisa" REGEXP "^[^a-dXYZ]+$"; -> 0 `[[.characters.]]' The sequence of characters of that collating element. The sequence is a single element of the bracket expression's list. A bracket expression containing a multi-character collating element can thus match more than one character, for example, if the collating sequence includes a `ch' collating element, then the regular expression `[[.ch.]]*c' matches the first five characters of `chchcc'. `[=character_class=]' An equivalence class, standing for the sequences of characters of all collating elements equivalent to that one, including itself. For example, if `o' and `(+)' are the members of an equivalence class, then `[[=o=]]', `[[=(+)=]]', and `[o(+)]' are all synonymous. An equivalence class may not be an endpoint of a range. `[:character_class:]' Within a bracket expression, the name of a character class enclosed in `[:' and `:]' stands for the list of all characters belonging to that class. Standard character class names are: alnum digit punct alpha graph space blank lower upper cntrl print xdigit These stand for the character classes defined in the `ctype(3)' manual page. A locale may provide others. A character class may not be used as an endpoint of a range. mysql> select "justalnums" REGEXP "[[:alnum:]]+"; -> 1 mysql> select "!!" REGEXP "[[:alnum:]]+"; -> 0 `[[:<:]]' `[[:>:]]' These match the null string at the beginning and end of a word respectively. A word is defined as a sequence of word characters which is neither preceded nor followed by word characters. A word character is an alnum character (as defined by `ctype(3)') or an underscore (`_'). mysql> select "a word a" REGEXP "[[:<:]]word[[:>:]]"; -> 1 mysql> select "a xword a" REGEXP "[[:<:]]word[[:>:]]"; -> 0 mysql> select "weeknights" REGEXP "^(wee|week)(knights|nights)$"; -> 1 automatically generated by info2www version 1.2.2.9 |