Copyright (C) 2000-2012
GNU Info (m4.info) Changeword

Changing the lexical structure of words
=======================================

     The macro `changeword' and all associated functionality are
     experimental.  It is only available if the `--enable-changeword'
     option was given to `configure', at GNU `m4' installation time.
     The functionality might change or even go away in the future.
     _Do not rely on it_.  Please direct your comments about it the
     same way you would for bugs.

   A file being processed by `m4' is split into quoted strings, words
(potential macro names) and simple tokens (any other single character).
Initially a word is defined by the following regular expression:

     [_a-zA-Z][_a-zA-Z0-9]*

   Using `changeword', you can change this regular expression.  Relaxing
`m4''s lexical rules might be useful (for example) if you wanted to
apply translations to a file of numbers:

     changeword(`[_a-zA-Z0-9]+')
     define(1, 0)
     =>1

   Tightening the lexical rules is less useful, because it will
generally make some of the builtins unavailable.  You could use it to
prevent accidental calls to builtins, for example:

     define(`_indir', defn(`indir'))
     changeword(`_[_a-zA-Z0-9]*')
     esyscmd(foo)
     _indir(`esyscmd', `ls')

   Because `m4' constructs its words a character at a time, there is a
restriction on the regular expressions that may be passed to
`changeword': if your regular expression accepts `foo', it must also
accept `f' and `fo'.

   `changeword' has another function.  If the regular expression
supplied contains any bracketed subexpressions, then text outside the
first of these is discarded before symbol lookup.  So:

     changecom(`/*', `*/')
     changeword(`#\([_a-zA-Z0-9]*\)')
     #esyscmd(ls)

   `m4' now requires a `#' mark at the beginning of every macro
invocation, so one can use `m4' to preprocess shell scripts without
getting `shift' commands swallowed, and plain text without losing
various common words.

   `m4''s macro substitution is based on text, while TeX's is based on
tokens.
`changeword' can throw this difference into relief.  For example, here
is the same idea represented in TeX and `m4'.  First, the TeX version:

     \def\a{\message{Hello}}
     \catcode`\@=0
     \catcode`\\=12
     =>@a
     =>@bye

Then, the `m4' version:

     define(a, `errprint(`Hello')')
     changeword(`@\([_a-zA-Z0-9]*\)')
     =>@a

   In the TeX example, the first line defines a macro `a' to print the
message `Hello'.  The second line defines <@> to be usable instead of
<\> as an escape character.  The third line defines <\> to be a normal
printing character, not an escape.  The fourth line invokes the macro
`a'.  So, when TeX is run on this file, it displays the message `Hello'.

   When the `m4' example is passed through `m4', it outputs
`errprint(Hello)'.  The reason for this is that TeX does lexical
analysis of a macro definition when the macro is _defined_.  `m4' just
stores the text, postponing the lexical analysis until the macro is
_used_.

   You should note that using `changeword' will slow `m4' down by a
factor of about seven.
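
   The prefix restriction described earlier (a regular expression
that accepts `foo' must also accept `f' and `fo') and the rule that
only the first bracketed subexpression is used for symbol lookup can
both be checked outside `m4' before a pattern is handed to
`changeword'.  The following is a minimal sketch in Python, not part of
`m4'.  The helper names `accepts_all_prefixes' and `lookup_name' are
hypothetical, and Python's `re' syntax is assumed as a stand-in for the
regular expression syntax `m4' uses, so groups are written `(...)'
rather than `\(...\)'.

```python
import re


def accepts_all_prefixes(pattern, word):
    """Return True if `pattern` fully matches every non-empty prefix of `word`.

    This mirrors the stated restriction: a pattern that accepts `foo`
    must also accept `f` and `fo`.
    """
    compiled = re.compile(pattern)
    return all(compiled.fullmatch(word[:i]) for i in range(1, len(word) + 1))


def lookup_name(pattern, token):
    """Return the text that would be used for symbol lookup, or None.

    If the pattern contains a group, only the text of the first group is
    kept; otherwise the whole match is used.  None means `token` is not
    a word under this pattern.
    """
    match = re.fullmatch(pattern, token)
    if match is None:
        return None
    # Pattern.groups is the number of capturing groups in the pattern.
    return match.group(1) if match.re.groups else match.group(0)


if __name__ == "__main__":
    default_word = r"[_a-zA-Z][_a-zA-Z0-9]*"
    hash_word = r"#([_a-zA-Z0-9]*)"

    print(accepts_all_prefixes(default_word, "foo"))   # default word pattern
    print(accepts_all_prefixes(r"foo", "foo"))         # rejects `f` and `fo`
    print(lookup_name(hash_word, "#esyscmd"))          # name after the `#`
    print(lookup_name(hash_word, "shift"))             # not a word: no `#`
```

   A pattern such as `foo' fails the check because it rejects the
prefixes `f' and `fo', whereas the default word pattern passes.  The
check only exercises the prefixes of the sample word it is given, so
it can expose a bad pattern but cannot prove that a pattern is safe
for every input.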