Copyright (C) 2000-2012
Introduction to Arrays
======================

The `awk' language provides one-dimensional arrays for storing groups
of related strings or numbers.  Every `awk' array must have a name.
Array names have the same syntax as variable names; any valid variable
name would also be a valid array name.  But one name cannot be used in
both ways (as an array and as a variable) in the same `awk' program.

Arrays in `awk' superficially resemble arrays in other programming
languages, but there are fundamental differences.  In `awk', it isn't
necessary to specify the size of an array before starting to use it.
Additionally, any number or string in `awk', not just consecutive
integers, may be used as an array index.

In most other languages, arrays must be "declared" before use,
including a specification of how many elements or components they
contain.  In such languages, the declaration causes a contiguous block
of memory to be allocated for that many elements.  Usually, an index in
the array must be a nonnegative integer.  For example, the index zero
specifies the first element in the array, which is actually stored at
the beginning of the block of memory.  Index one specifies the second
element, which is stored in memory right after the first element, and
so on.  It is impossible to add more elements to the array, because it
has room only for as many elements as given in the declaration.  (Some
languages allow arbitrary starting and ending indices--e.g.,
`15 .. 27'--but the size of the array is still fixed when the array is
declared.)

A contiguous array of four elements might look like the following
example, conceptually, if the element values are 8, `"foo"', `""', and
30:

     +---------+---------+--------+---------+
     |    8    |  "foo"  |   ""   |    30   |    Value
     +---------+---------+--------+---------+
          0         1         2         3        Index

Only the values are stored; the indices are implicit from the order of
the values.
Here, 8 is the value at index zero, because 8 appears in the position
with zero elements before it.

Arrays in `awk' are different--they are "associative".  This means that
each array is a collection of pairs: an index, and its corresponding
array element value:

     Element 3     Value 30
     Element 1     Value "foo"
     Element 0     Value 8
     Element 2     Value ""

The pairs are shown in jumbled order because their order is irrelevant.

One advantage of associative arrays is that new pairs can be added at
any time.  For example, suppose a tenth element is added to the array
whose value is `"number ten"'.  The result is:

     Element 10    Value "number ten"
     Element 3     Value 30
     Element 1     Value "foo"
     Element 0     Value 8
     Element 2     Value ""

Now the array is "sparse", which just means some indices are missing.
It has elements 0-3 and 10, but doesn't have elements 4, 5, 6, 7, 8,
or 9.

Another consequence of associative arrays is that the indices don't
have to be nonnegative integers.  Any number, or even a string, can be
an index.  For example, the following is an array that translates words
from English into French:

     Element "dog" Value "chien"
     Element "cat" Value "chat"
     Element "one" Value "un"
     Element 1     Value "un"

Here we decided to translate the number one in both spelled-out and
numeric form--thus illustrating that a single array can have both
numbers and strings as indices.  In fact, array subscripts are always
strings; this is discussed in more detail in Note: Using Numbers to
Subscript Arrays.  Here, the number `1' isn't double-quoted, since
`awk' automatically converts it to a string.

The value of `IGNORECASE' has no effect upon array subscripting.  The
identical string value used to store an array element must be used to
retrieve it.  When `awk' creates an array (e.g., with the `split'
built-in function), that array's indices are consecutive integers
starting at one.  (Note: String Manipulation Functions.)

`awk''s arrays are efficient--the time to access an element is
independent of the number of elements in the array.
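The points above can be tried out with a short program.  What follows
is only a minimal sketch, not an example from the manual itself: the
array names `sparse', `fr', and `parts' and the shell function name
`array_demo' are hypothetical, chosen for illustration.  It assumes a
POSIX shell and any POSIX `awk'.  `IGNORECASE' is specific to `gawk';
other versions of `awk' treat it as an ordinary variable, so the
assignment is harmless there.

```sh
#!/bin/sh
# Sketch only: the function and array names here are hypothetical.
array_demo() {
    awk 'BEGIN {
        # No declaration or size is needed; an element exists once assigned.
        sparse[0] = 8; sparse[1] = "foo"; sparse[2] = ""; sparse[3] = 30
        sparse[10] = "number ten"

        # The "in" operator tests for an index without creating it.
        print "has 10:", ((10 in sparse) ? "yes" : "no")
        print "has 4:", ((4 in sparse) ? "yes" : "no")

        # One array may mix string and numeric subscripts.
        fr["dog"] = "chien"; fr["cat"] = "chat"
        fr["one"] = "un";    fr[1] = "un"

        # Subscripts are always strings, so 1 and "1" name the same element.
        print "fr[\"1\"]:", fr["1"]

        # split() creates indices 1 through n.
        n = split("a b c", parts, " ")
        print "n:", n, "first:", parts[1], "last:", parts[n]

        # gawk-specific variable; it does not change subscript matching.
        IGNORECASE = 1
        print "has Dog:", (("Dog" in fr) ? "yes" : "no")
    }'
}

array_demo
```

When run, the sketch should print:

     has 10: yes
     has 4: no
     fr["1"]: un
     n: 3 first: a last: c
     has Dog: no

The membership tests use `in' rather than a comparison such as
`sparse[4] == ""', because merely referring to `sparse[4]' would create
that element with an empty value.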