Info Node: (ed.info)Regular expressions

Regular expressions
*******************

Regular expressions are patterns used in selecting text.  For example,
the `ed' command

     g/STRING/

prints all lines containing STRING.  Regular expressions are also used
by the `s' command for selecting old text to be replaced with new.

In addition to specifying string literals, regular expressions can
represent classes of strings.  Strings thus represented are said to be
matched by the corresponding regular expression.  If it is possible for
a regular expression to match several strings in a line, then the
left-most longest match is the one selected.

The following symbols are used in constructing regular expressions:

`C'
     Any character C not listed below, including `{', `}', `(', `)',
     `<' and `>', matches itself.

`\C'
     Any backslash-escaped character C, other than `{', `}', `(', `)',
     `<', `>', `b', `B', `w', `W', `+' and `?', matches itself.

`.'
     Matches any single character.

`[CHAR-CLASS]'
     Matches any single character in CHAR-CLASS.  To include a `]' in
     CHAR-CLASS, it must be the first character.  A range of characters
     may be specified by separating the end characters of the range
     with a `-', e.g., `a-z' specifies the lower case characters.  The
     following literal expressions can also be used in CHAR-CLASS to
     specify sets of characters:

          [:alnum:]  [:cntrl:]  [:lower:]  [:space:]
          [:alpha:]  [:digit:]  [:print:]  [:upper:]
          [:blank:]  [:graph:]  [:punct:]  [:xdigit:]

     If `-' appears as the first or last character of CHAR-CLASS, then
     it matches itself.  All other characters in CHAR-CLASS match
     themselves.

     Patterns in CHAR-CLASS of the form:

          [.COL-ELM.]
          [=COL-ELM=]

     where COL-ELM is a "collating element" are interpreted according
     to `locale (5)' (not currently supported).  See `regex (3)' for an
     explanation of these constructs.

`[^CHAR-CLASS]'
     Matches any single character, other than newline, not in
     CHAR-CLASS.  CHAR-CLASS is defined as above.

`^'
     If `^' is the first character of a regular expression, then it
     anchors the regular expression to the beginning of a line.
     Otherwise, it matches itself.

`$'
     If `$' is the last character of a regular expression, it anchors
     the regular expression to the end of a line.  Otherwise, it
     matches itself.

`\(RE\)'
     Defines a (possibly null) subexpression RE.  Subexpressions may be
     nested.  A subsequent backreference of the form `\N', where N is a
     number in the range [1,9], expands to the text matched by the Nth
     subexpression.  For example, the regular expression `\(a.c\)\1'
     matches the string `abcabc', but not `abcadc'.  Subexpressions are
     ordered relative to their left delimiter.

`*'
     Matches the single character regular expression or subexpression
     immediately preceding it zero or more times.  If `*' is the first
     character of a regular expression or subexpression, then it
     matches itself.  The `*' operator sometimes yields unexpected
     results.  For example, the regular expression `b*' matches the
     beginning of the string `abbb', as opposed to the substring `bbb',
     since a null match is the only left-most match.

`\{N,M\}'
`\{N,\}'
`\{N\}'
     Matches the single character regular expression or subexpression
     immediately preceding it at least N and at most M times.  If M is
     omitted, then it matches at least N times.  If the comma is also
     omitted, then it matches exactly N times.  If any of these forms
     occurs first in a regular expression or subexpression, then it is
     interpreted literally (e.g., the regular expression `\{2\}'
     matches the string `{2}', and so on).

`\<'
`\>'
     Anchors the single character regular expression or subexpression
     immediately following it to the beginning (in the case of `\<') or
     ending (in the case of `\>') of a "word", i.e., in ASCII, a
     maximal string of alphanumeric characters, including the
     underscore (_).

The following extended operators are preceded by a backslash `\' to
distinguish them from traditional `ed' syntax.

`\`'
`\''
     Unconditionally matches the beginning `\`' or ending `\'' of a
     line.

`\?'
     Optionally matches the single character regular expression or
     subexpression immediately preceding it.  For example, the regular
     expression `a[bd]\?c' matches the strings `abc', `adc' and `ac'.
     If `\?' occurs at the beginning of a regular expression or
     subexpression, then it matches a literal `?'.

`\+'
     Matches the single character regular expression or subexpression
     immediately preceding it one or more times.  So the regular
     expression `a\+' is shorthand for `aa*'.  If `\+' occurs at the
     beginning of a regular expression or subexpression, then it
     matches a literal `+'.

`\b'
     Matches the beginning or ending (null string) of a word.  Thus the
     regular expression `\bhello\b' is equivalent to `\<hello\>'.
     However, `\b\b' is a valid regular expression whereas `\<\>' is
     not.

`\B'
     Matches (a null string) inside a word.

`\w'
     Matches any character in a word.

`\W'
     Matches any character not in a word.
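
A few of the rules above can be tried outside the editor.  The following
is a minimal sketch, assuming GNU `sed' and GNU `grep', whose basic
regular expressions accept the same operators as those described in
this node; `ed' itself is not invoked.  The helper names `subst' and
`matches' are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch only: GNU sed and GNU grep stand in for ed here (an assumption).
# subst and matches are hypothetical helper names, not ed commands.

# Replace the first match of basic regular expression $1 in string $2
# with X, much as an ed s/RE/X/ command would on a line holding $2.
subst() {
    printf '%s\n' "$2" | sed "s/$1/X/"
}

# Succeed if basic regular expression $1 matches somewhere in string $2,
# the same test an ed g/RE/ command applies to each line.
matches() {
    printf '%s\n' "$2" | grep -q -e "$1"
}

subst 'b*' 'abbb'                      # Xabbb: the null match at the start wins
subst 'a\{2,3\}' 'aaaa'                # Xa: the longest run allowed, three a's
subst '\<hello\>' 'hello_world hello'  # hello_world X: _ is a word character
subst 'a\+' 'caaat'                    # cXt: same result as the pattern aa*
subst '[[:digit:]]' 'ed-1.2'           # ed-X.2: only the first digit is replaced

matches '\(a.c\)\1' 'abcabc' && echo 'abcabc: match'
matches '\(a.c\)\1' 'abcadc' || echo 'abcadc: no match'
matches 'a[bd]\?c' 'ac' && echo 'ac: match'
```

The first call shows the left-most rule taking priority over the longest
rule: `b*' matches the empty string before `a', so `bbb' is never
reached.  On a non-GNU `sed' or `grep', the extended operators `\<',
`\>', `\+' and `\?' may not be recognized.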
