GNU Info

Info Node: (gawk.info)Scanning an Array

Scanning All Elements of an Array
=================================

   In programs that use arrays, it is often necessary to use a loop that
executes once for each element of an array.  In other languages, where
arrays are contiguous and indices are limited to positive integers,
this is easy: all the valid indices can be found by counting from the
lowest index up to the highest.  This technique won't do the job in
`awk', because any number or string can be an array index.  So `awk'
has a special kind of `for' statement for scanning an array:

     for (VAR in ARRAY)
       BODY

This loop executes BODY once for each index in ARRAY that the program
has previously used, with the variable VAR set to that index.
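   As a minimal sketch of this loop, the following shell function runs
a short `awk' program that stores three elements and then visits each
index.  The function name `scan_demo', the array name `seen', and the
indices are made up for illustration.  Because the order of traversal
is not specified, the output is piped through `sort' to make it
reproducible.

```shell
# Hypothetical example: names and indices are illustrative only.
scan_demo() {
    awk 'BEGIN {
        seen["apple"] = 1      # string index
        seen[42] = 1           # numeric index
        seen["x y"] = 1        # index containing a space
        for (k in seen)        # k is set to each stored index in turn
            print k
    }' | LC_ALL=C sort         # sort only to get a stable order
}
scan_demo
```

Each of the three indices is printed exactly once, whether it was
stored as a number or as a string.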

   The following program uses this form of the `for' statement.  The
first rule scans the input records and notes which words appear (at
least once) in the input, by storing a one into the array `used' with
the word as index.  The second rule scans the elements of `used' to
find all the distinct words that appear in the input.  It prints each
word that is more than 10 characters long and also prints the number of
such words.  See String Manipulation Functions for more information
on the built-in function `length'.

     # Record a 1 for each word that is used at least once
     {
         for (i = 1; i <= NF; i++)
             used[$i] = 1
     }
     
     # Find number of distinct words more than 10 characters long
     END {
         for (x in used)
             if (length(x) > 10) {
                 ++num_long_words
                 print x
             }
         # Adding 0 prints a numeric 0 if no long words were found
         print num_long_words + 0, "words longer than 10 characters"
     }

See Generating Word Usage Counts for a more detailed example of this
type.
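   One way to try the program is sketched below.  The function name
`long_words_demo' and the two input lines are invented for
illustration; the `awk' program is the one shown above.  The output is
piped through `sort' only because the order in which the `for' loop
visits the words is not specified.

```shell
# Hypothetical usage example: the sample input is invented.
long_words_demo() {
    printf '%s\n' \
        'the internationalization of documentation' \
        'the documentation is extraordinary' |
    awk '
    # Record a 1 for each word that is used at least once
    {
        for (i = 1; i <= NF; i++)
            used[$i] = 1
    }

    # Find number of distinct words more than 10 characters long
    END {
        for (x in used)
            if (length(x) > 10) {
                ++num_long_words
                print x
            }
        print num_long_words, "words longer than 10 characters"
    }' | LC_ALL=C sort         # sort only to get a stable order
}
long_words_demo
```

With this input the word "documentation" appears twice but is printed
once, and the count is 3.  After sorting, the count line comes first
because it begins with a digit.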

   The order in which elements of the array are accessed by this
statement is determined by the internal arrangement of the array
elements within `awk' and cannot be controlled or changed.  This can
lead to problems if new elements are added to ARRAY by statements in
the loop body; it is not predictable whether or not the `for' loop will
reach them.  Similarly, changing VAR inside the loop may produce
strange results.  It is best to avoid both practices.
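   One way to add elements while still visiting a known set of indices
is to copy the indices into a second array first and then loop over the
copy by counting.  The sketch below assumes this approach; the names
`safe_add_demo', `keys', and the `_copy' suffix are hypothetical and
chosen only for illustration.

```shell
# Hypothetical example: snapshot the indices before modifying the array.
safe_add_demo() {
    awk 'BEGIN {
        used["a"] = 1
        used["b"] = 1

        # First pass: copy the indices into a separate list.
        n = 0
        for (k in used)
            keys[++n] = k

        # Second pass: walk the snapshot by counting, so adding to
        # used cannot change which indices are visited.
        for (i = 1; i <= n; i++)
            used[keys[i] "_copy"] = 1

        # Count the elements now stored in used.
        count = 0
        for (k in used)
            count++
        print n, count
    }'
}
safe_add_demo
```

This prints `2 4': two indices were captured in the snapshot, and the
array holds four elements once the loop over the snapshot has finished.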

