GNU macro processor - Introduction and preliminaries

Copyright (C) 1989, 90, 91, 92, 93, 94 Free Software Foundation, Inc.

Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.

Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the Foundation.

Introduction and preliminaries

This first chapter explains what GNU m4 is, where m4 comes from, how to read and use this documentation, how to call the m4 program and how to report bugs about it. It concludes by giving tips for reading the remainder of the manual.

The following chapters then detail all the features of the m4 language.

Introduction to m4

m4 is a macro processor, in the sense that it copies its input to the output, expanding macros as it goes. Macros are either builtin or user-defined, and can take any number of arguments. Besides just doing macro expansion, m4 has builtin functions for including named files, running UNIX commands, doing integer arithmetic, manipulating text in various ways, recursion, etc. m4 can be used either as a front-end to a compiler, or as a macro processor in its own right.
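
As a minimal sketch of the behavior just described, the following shell session feeds m4 a user-defined macro, a call to it, and a call to the builtin `eval'. It assumes a GNU m4 binary named `m4' on the PATH and a POSIX shell; the macro name `greet' and the temporary file are purely illustrative.

```shell
# Write a small m4 input file; the quoted 'EOF' keeps the shell from
# interpreting the m4 quote characters and the $1 parameter.
input=$(mktemp)
cat > "$input" <<'EOF'
define(`greet', `Hello, $1!')dnl
greet(`world')
eval(6 * 7)
EOF

# Macro calls are expanded; all other text is copied through unchanged.
out=$(m4 "$input")
echo "$out"
rm -f "$input"
```

The `dnl' after the definition discards the rest of that line, so the definition itself leaves no blank line in the output.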

The m4 macro processor is widely available on all UNIXes. Usually, only a small percentage of users are aware of its existence. However, those who are often become committed users. The growing popularity of GNU Autoconf, which requires GNU m4 for generating the `configure' scripts, is an incentive for many to install it, even though these people will not themselves program in m4. GNU m4 is mostly compatible with the System V, Release 3 version, except for some minor differences. See section Compatibility with other versions of m4 for more details.

Some people find m4 to be fairly addictive. They first use m4 for simple problems, then take on bigger and bigger challenges, learning how to write complex sets of m4 macros along the way. Once really addicted, users pursue writing of sophisticated m4 applications even to solve simple problems, devoting more time to debugging their m4 scripts than to doing real work. Beware that m4 may be dangerous for the health of compulsive programmers.

Historical references

The historical notes included here are fairly incomplete, and not authoritative at all. We ask knowledgeable users to help us write this section more accurately.

GPM was an important ancestor of m4. See C. Strachey: "A General Purpose Macro generator", Computer Journal 8,3 (1965), pp. 225 ff. GPM is also succinctly described in David Gries's classic "Compiler Construction for Digital Computers".

While GPM was pure, m4 was meant to deal more with the true intricacies of real life: macros could be recognized without being pre-announced, skipping whitespace or end-of-lines was made easier, more constructs were builtin instead of derived, etc.

Originally, m4 was the engine for the Rational FORTRAN preprocessor, that is, the ratfor equivalent of cpp.

Invoking m4

The format of the m4 command is:

m4 [option...] [macro-definitions...] [input-file...]

All options begin with `-', or if long option names are used, with `--'. A long option name need not be written completely; any unambiguous prefix is sufficient. m4 understands the following options:

--version
Print the version number of the program on standard output, then immediately exit m4 without reading any input files.
--help
Print a help summary on standard output, then immediately exit m4 without reading any input files.
-G, --traditional
Suppress all the extensions made in this implementation, compared to the System V version. See section Compatibility with other versions of m4, for a list of these.
-E, --fatal-warnings
Stop execution and exit m4 once the first warning has been issued, considering all of them to be fatal.
-dflags, --debug=flags
Set the debug-level according to the flags flags. The debug-level controls the format and amount of information presented by the debugging functions. See section Controlling debugging output for more details on the format and meaning of flags.
-l num, --arglength=num
Restrict the size of the output generated by macro tracing. See section Controlling debugging output for more details.
-o file, --error-output=file
Redirect debug and trace output to the named file. Error messages are still printed on the standard error output. See section Saving debugging output for more details.
-I dir, --include=dir
Make m4 search dir for included files that are not found in the current working directory. See section Searching for include files for more details.
-e, --interactive
Make this invocation of m4 interactive. This means that all output will be unbuffered, and interrupts will be ignored.
-s, --synclines
Generate synchronisation lines, for use by the C preprocessor or other similar tools. This is useful, for example, when m4 is used as a front end to a compiler. Source file name and line number information is conveyed by directives of the form `#line linenum "filename"', which are inserted as needed into the middle of the output. Such directives mean that the following line originated or was expanded from the contents of input file filename at line linenum. The `"filename"' part is often omitted when the file name did not change from the previous directive. Synchronisation directives are always given on complete lines by themselves. When a synchronisation discrepancy occurs in the middle of an output line, the associated synchronisation directive is delayed until the beginning of the next generated line.
-P, --prefix-builtins
Internally modify all builtin macro names so they all start with the prefix `m4_'. For example, using this option, one should write `m4_define' instead of `define', and `m4___file__' instead of `__file__'.
-W regexp, --word-regexp=regexp
Use an alternative syntax for macro names. This experimental option might not be present on all GNU m4 implementations (see section Changing the lexical structure of words).
-H n, --hashsize=n
Make the internal hash table for symbol lookup be n entries big. The number should be prime. The default is 509 entries. It should not be necessary to increase this value, unless you define an excessive number of macros.
-L n, --nesting-limit=n
Artificially limit the nesting of macro calls to n levels, stopping program execution if this limit is ever exceeded. When not specified, nesting is limited to 250 levels. The precise effect of this option might be more correctly associated with textual nesting than dynamic recursion. It has been useful when some complex m4 input was generated by mechanical means. Most users will never need this option. If shown to be obtrusive, this option (which is still experimental) might well disappear. This option cannot break endless rescanning loops, since these do not necessarily consume much memory or stack space. Through clever use of rescanning loops, one can request complex, time-consuming computations from m4 with useful results. Putting limitations in this area would cripple m4's power. There are many pathological cases: `define(`a', `a')a' is only the simplest example (but see section Compatibility with other versions of m4). Expecting GNU m4 to detect these would be a little like expecting a compiler system to detect and diagnose endless loops: it is quite a hard problem in general, if not undecidable!
-Q, --quiet, --silent
Suppress warnings about missing or superfluous arguments in macro calls.
-B, -S, -T
These options are present for compatibility with System V m4, but do nothing in this implementation.
-N n, --diversions=n
These options are present only for compatibility with previous versions of GNU m4, where they controlled the number of possible diversions which could be used at the same time. They do nothing, because there is no longer a fixed limit.
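
Two of the options above are easy to try from a shell. The sketch below assumes GNU m4 on the PATH, where the builtin-prefixing option is spelled `-P' and the synchronisation-line option `-s'; the macro names `x' and `y' and the temporary file are illustrative only.

```shell
# With builtin prefixing, only the m4_-prefixed names are macros;
# a plain `define' is just ordinary text and is copied through.
input=$(mktemp)
cat > "$input" <<'EOF'
m4_define(`x', `expanded')m4_dnl
x define(`y', `ignored')
EOF
prefixed=$(m4 -P "$input")
echo "$prefixed"
rm -f "$input"

# With synchronisation lines, `#line' directives are interleaved with
# the copied text so later tools can map output back to the input.
synced=$(printf 'first\nsecond\n' | m4 -s)
echo "$synced"
```

In the first run the unprefixed `define(...)' call is not expanded: its text is copied to the output with one level of quotes removed.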

Macro definitions and deletions can be made on the command line, by using the `-D' and `-U' options. They have the following format:

-D name[=value], --define=name[=value]
This enters name into the symbol table, before any input files are read. If `=value' is missing, the value is taken to be the empty string. The value can be any string, and the macro can be defined to take arguments, just as if it were defined from within the input.
-U name, --undefine=name
This deletes any predefined meaning name might have. Obviously, only predefined macros can be deleted in this way.
-t name, --trace=name
This enters name into the symbol table, as undefined but traced. The macro will consequently be traced from the point it is defined.
-F file, --freeze-state file
Once execution is finished, write out the frozen state to the specified file (see section Fast loading of frozen states).
-R file, --reload-state file
Before execution starts, recover the internal state from the specified frozen file (see section Fast loading of frozen states).
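
The following sketch illustrates command-line definition and deletion with `-D' and `-U', assuming GNU m4 on the PATH. The names `TARGET' and `twice' and their values are invented for illustration.

```shell
# Define a macro before any input is read.
defined=$(echo 'Building for TARGET' | m4 -DTARGET=arm64)

# A command-line definition may use arguments, as in a normal define.
with_args=$(echo 'twice(ab)' | m4 -D'twice=$1$1')

# Delete the builtin `len'; its name then passes through as plain text.
undefined=$(echo 'len(abc)' | m4 -Ulen)

# For comparison, the same input with `len' still defined.
builtin_len=$(echo 'len(abc)' | m4)

echo "$defined"      # Building for arm64
echo "$with_args"    # abab
echo "$undefined"    # len(abc)
echo "$builtin_len"  # 3
```

The single quotes around `twice=$1$1' matter only to the shell: they stop it from expanding `$1' before m4 sees the definition.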

The remaining arguments on the command line are taken to be input file names. If no names are present, the standard input is read. A file name of `-' is taken to mean the standard input.

The input files are read in the sequence given. The standard input can only be read once, so the filename `-' should only appear once on the command line.
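
As a small sketch of mixing a named file with the standard input, assuming GNU m4 on the PATH (the macro `NAME' and the temporary definitions file are hypothetical):

```shell
# A file holding only definitions; dnl swallows the trailing newline.
defs=$(mktemp)
cat > "$defs" <<'EOF'
define(`NAME', `m4')dnl
EOF

# The definitions file is read first, then `-' reads standard input,
# so the macro is already defined when the piped text is scanned.
out=$(echo 'Hello from NAME' | m4 "$defs" -)
echo "$out"   # Hello from m4
rm -f "$defs"
```

Reversing the order of the two arguments would scan the piped text before `NAME' is defined, and it would be copied through unexpanded.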

Problems and bugs

If you have problems with GNU m4 or think you've found a bug, please report it. Before reporting a bug, make sure you've actually found a real bug. Carefully reread the documentation and see if it really says you can do what you're trying to do. If it's not clear whether you should be able to do something or not, report that too; it's a bug in the documentation!

Before reporting a bug or trying to fix it yourself, try to isolate it to the smallest possible input file that reproduces the problem. Then send us the input file and the exact results m4 gave you. Also say what you expected to occur; this will help us decide whether the problem was really in the documentation.

Once you've got a precise problem, send e-mail to (Internet) `bug-gnu-utils@prep.ai.mit.edu' or (UUCP) `mit-eddie!prep.ai.mit.edu!bug-gnu-utils'. Please include the version number of m4 you are using. You can get this information with the command `m4 --version'.

Non-bug suggestions are always welcome as well. If you have questions about things that are unclear in the documentation or are just obscure features, please report them too.

Using this manual

This manual contains a number of examples of m4 input and output, and a simple notation is used to distinguish input, output and error messages from m4. Examples are set out from the normal text, and shown in a fixed width font, like this:

This is an example of an example!

To distinguish input from output, all output from m4 is prefixed by the string `=>', and all error messages by the string `error-->'. Thus:

Example of input line
=>Output line from m4
error-->and an error message

As each of the predefined macros in m4 is described, a prototype call of the macro will be shown, giving descriptive names to the arguments, e.g.,

regexp(string, regexp, opt replacement)

All macro arguments in m4 are strings, but some are given special interpretation, e.g., as numbers, filenames, regular expressions, etc.

The `opt' before the third argument shows that this argument is optional--if it is left out, it is taken to be the empty string. An ellipsis (`...') last in the argument list indicates that any number of arguments may follow.
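
To make the optional third argument concrete, the sketch below calls `regexp' once with two arguments and once with three. It assumes GNU m4 on the PATH (regular expression syntax differs in other implementations); the sample string and patterns are chosen only for illustration.

```shell
input=$(mktemp)
cat > "$input" <<'EOF'
regexp(`GNUs not Unix', `\<[a-z]\w+')
regexp(`GNUs not Unix', `\w\(\w+\)$', `*** \& *** \1 ***')
EOF

# Two arguments: the expansion is the index of the first match.
# Three arguments: the expansion is the replacement text, with \& and
# \1 standing for the whole match and the first parenthesized group.
out=$(m4 "$input")
echo "$out"
rm -f "$input"
```

The first call expands to `5' (the position of `not') and the second to `*** Unix *** nix ***'. Both patterns are passed as ordinary strings and only interpreted as regular expressions by the builtin.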

This document consistently writes and uses builtin, without a hyphen, as if it were an English word. This is how the builtin primitive is spelled within m4.
