Examining Fields
================
When `awk' reads an input record, the record is automatically
separated or "parsed" by the interpreter into chunks called "fields".
By default, fields are separated by "whitespace", like words in a line.
Whitespace in `awk' means any string of one or more spaces, tabs, or
newlines;(1) other characters that some languages treat as whitespace,
such as formfeed and vertical tab, are _not_ considered whitespace by
`awk'.
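
As a minimal sketch of the default splitting described above, the
following feeds `awk' one made-up record whose words are separated by
runs of spaces and tabs. The input text and the shell variable name
`out_ws' are assumptions chosen for illustration; they do not come from
any sample file.

```shell
# Hypothetical record: three words separated by three spaces, then two tabs.
out_ws=$(printf 'alpha   beta\t\tgamma\n' | awk '{ print NF; print $2 }')
printf '%s\n' "$out_ws"
# prints:
# 3
# beta
```

Each run of spaces or tabs counts as a single separator, so the record
has three fields and no empty ones.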
The purpose of fields is to make it more convenient for you to refer
to these pieces of the record. You don't have to use them--you can
operate on the whole record if you want--but fields are what make
simple `awk' programs so powerful.
To refer to a field in an `awk' program, use a dollar sign (`$')
followed by the number of the field you want. Thus, `$1' refers to the
first field, `$2' to the second, and so on. (Unlike the Unix shells,
the field numbers are not limited to single digits. `$127' is the one
hundred and twenty-seventh field in the record.) For example, suppose
the following is a line of input:
This seems like a pretty nice example.
Here the first field, or `$1', is `This', the second field, or `$2', is
`seems', and so on. Note that the last field, `$7', is `example.'.
Because there is no space between the `e' and the `.', the period is
considered part of the seventh field.
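
The field references above can be checked directly. This is only a
sketch: it pipes the example line into `awk' from the shell, and the
variable names `line' and `out_fields' are assumptions made for the
illustration.

```shell
# The example line from the text, passed to awk on standard input.
line='This seems like a pretty nice example.'
out_fields=$(printf '%s\n' "$line" | awk '{ print $1; print $2; print $7 }')
printf '%s\n' "$out_fields"
# prints:
# This
# seems
# example.
```

The trailing period stays attached to the seventh field because no
whitespace separates it from the word.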
`NF' is a built-in variable whose value is the number of fields in
the current record. `awk' automatically updates the value of `NF' each
time it reads a record. No matter how many fields there are, the last
field in a record can be represented by `$NF'. For the example line
above, `$NF' is the same as `$7', which is `example.'. If you try to
reference a field beyond
the last one (such as `$8' when the record has only seven fields), you
get the empty string. (If used in a numeric operation, you get zero.)
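
A short sketch of `NF', `$NF', and a reference past the last field,
again using the seven-field example line. The square brackets around
`$8' are added here only to make the empty string visible, and the
variable name `out_nf' is an assumption for the illustration.

```shell
# Print the field count, the last field, a nonexistent field (bracketed),
# and that same nonexistent field used in a numeric expression.
out_nf=$(printf '%s\n' 'This seems like a pretty nice example.' |
  awk '{ print NF; print $NF; print "[" $8 "]"; print $8 + 0 }')
printf '%s\n' "$out_nf"
# prints:
# 7
# example.
# []
# 0
```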
The use of `$0', which looks like a reference to the "zeroth" field,
is a special case: it represents the whole input record when you are
not interested in specific fields. Here are some more examples:
$ awk '$1 ~ /foo/ { print $0 }' BBS-list
-| fooey 555-1234 2400/1200/300 B
-| foot 555-6699 1200/300 B
-| macfoo 555-6480 1200/300 A
-| sabafoo 555-2127 1200/300 C
This example prints each record in the file `BBS-list' whose first
field contains the string `foo'. The operator `~' is called a
"matching operator" (see "How to Use Regular Expressions"); it tests
whether a string (here, the field `$1') matches a
given regular expression.
By contrast, the following example looks for `foo' in _the entire
record_ and prints the first field and the last field for each matching
input record:
$ awk '/foo/ { print $1, $NF }' BBS-list
-| fooey B
-| foot B
-| macfoo A
-| sabafoo C
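
With `BBS-list', both programs happen to select the same four records,
so the difference between the two patterns is easy to miss. The sketch
below uses a made-up two-record input (not the real `BBS-list' file) in
which `foo' appears only in the last field of the second record. The
variable names `data', `first_only', and `whole' are assumptions for
the illustration.

```shell
# Hypothetical data: "foo" is in field 1 of the first record only,
# and in the last field of the second record only.
data='fooey 555-1234 B
barfly 555-7685 foo'

# Pattern restricted to the first field.
first_only=$(printf '%s\n' "$data" | awk '$1 ~ /foo/ { print $1, $NF }')

# Pattern matched against the whole record.
whole=$(printf '%s\n' "$data" | awk '/foo/ { print $1, $NF }')

printf 'first field only:\n%s\n' "$first_only"
printf 'whole record:\n%s\n' "$whole"
# prints:
# first field only:
# fooey B
# whole record:
# fooey B
# barfly foo
```

The second record is selected only by the whole-record pattern, because
its first field, `barfly', does not contain `foo'.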
---------- Footnotes ----------
(1) In POSIX `awk', newlines are not considered whitespace for
separating fields.