GNU Info

Info Node: (gawk.info)Leftmost Longest


How Much Text Matches?
======================

   Consider the following:

     echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'

   This example uses the `sub' function (which we haven't discussed
yet; see String Manipulation Functions) to make a change to the input
record.  Here, the regexp `/a+/' indicates "one or more `a'
characters," and the replacement text is `<A>'.

   The input contains four `a' characters.  `awk' (and POSIX) regular
expressions always match the leftmost, _longest_ sequence of input
characters that can match.  Thus, all four `a' characters are replaced
with `<A>' in this example:

     $ echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'
     -| <A>bcd
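
   The "leftmost" half of the rule takes priority over the "longest"
half.  The following sketch uses a made-up input record (`xaayaaaa',
not part of the original example) in which a short run of `a'
characters comes before a longer one.  The match begins at the earliest
possible position and is only then extended as far as it can go, so the
later, longer run is left alone:

```shell
# Hypothetical input: a run of two a's, then a longer run of four a's.
# sub() replaces only the first (leftmost) match, extended to its
# longest form at that position.
first=$(echo xaayaaaa | awk '{ sub(/a+/, "<A>"); print }')
echo "$first"    # x<A>yaaaa
```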

   For simple match/no-match tests, this is not so important. But when
doing text matching and substitutions with the `match', `sub', `gsub',
and `gensub' functions, it is very important.  See String Manipulation
Functions for more information on these functions.  Understanding this
principle is also important for regexp-based record and field splitting
(see How Input Is Split into Records, and also Specifying How Fields
Are Separated).
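
   As a rough illustration of why this matters outside of `sub', the
sketch below shows the same rule at work in `match' and in field
splitting.  The input `a::::b' and the shell variable names are chosen
only for this illustration.  `match' reports the extent of the
leftmost, longest match in `RSTART' and `RLENGTH', and a regexp field
separator such as `:+' consumes a whole run of colons as a single
separator:

```shell
# match() sets RSTART and RLENGTH to the start and length of the
# leftmost, longest match: all four a's, starting at position 1.
span=$(echo aaaabcd | awk '{ if (match($0, /a+/)) print RSTART, RLENGTH }')
echo "$span"             # 1 4

# With the regexp separator ":+", the run of four colons is matched as
# one separator, leaving two fields.
nf_run=$(echo 'a::::b' | awk -F':+' '{ print NF }')

# With the single-character separator ":", every colon separates
# fields, so three empty fields appear between "a" and "b".
nf_one=$(echo 'a::::b' | awk -F':' '{ print NF }')
echo "$nf_run $nf_one"   # 2 5
```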

