Info Node: (gawk.info)Floating Point Issues

Floating-Point Number Caveats
=============================

   As mentioned earlier, floating-point numbers represent what are
called "real" numbers; i.e., those that can have a fractional part.
`awk' uses double-precision floating-point numbers to represent all
numeric values.  This minor node describes some of the issues involved
in using floating-point numbers.

   There is a very nice paper on floating-point arithmetic by David
Goldberg, `What Every Computer Scientist Should Know About
Floating-point Arithmetic', `ACM Computing Surveys' *23*, 1 (1991-03),
5-48.(1) This is worth reading if you are interested in the details,
but it does require a background in Computer Science.

   Internally, `awk' keeps both the numeric value (double-precision
floating-point) and the string value for a variable.  Separately, `awk'
keeps track of what type the variable has (see Variable Typing and
Comparison Expressions), which plays a role in how variables are used
in comparisons.
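
   As a minimal sketch of how the type affects a comparison, the
following compares the same two fields first as numbers and then as
strings.  The function name `compare_fields' is hypothetical and used
only for illustration; the behavior shown assumes a POSIX-conforming
`awk'.

```shell
# Hypothetical helper: compare two fields numerically and as strings.
compare_fields() {
    echo "$1 $2" | awk '{
        numeric = ($1 < $2)            # both fields look numeric: numeric comparison
        string  = (($1 "") < ($2 ""))  # concatenation forces string comparison
        print numeric, string
    }'
}

compare_fields 10 9
```

   With the input `10 9', the numeric comparison is false (10 is not
less than 9), whereas the string comparison is true (`"10"' sorts
before `"9"'), so this prints `0 1'.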

   It is important to note that the string value for a number may not
reflect the full value (all the digits) that the numeric value actually
contains.  The following program (`values.awk') illustrates this:

     {
        $1 = $2 + $3
        # see it for what it is
        printf("$1 = %.12g\n", $1)
        # use CONVFMT
        a = "<" $1 ">"
        print "a =", a
        # use OFMT
        print "$1 =", $1
     }

This program shows the full value of the sum of `$2' and `$3' using
`printf', and then prints the string values obtained from both
automatic conversion (via `CONVFMT') and from printing (via `OFMT').

   Here is what happens when the program is run:

     $ echo 2 3.654321 1.2345678 | awk -f values.awk
     -| $1 = 4.8888888
     -| a = <4.88889>
     -| $1 = 4.88889

   This makes it clear that the full numeric value is different from
what the default string representations show.

   `CONVFMT''s default value is `"%.6g"', which yields a value with at
most six significant digits.  For some applications, you might want to
change it to specify more precision.  On most modern machines, most of
the time, 17 digits is enough to capture a floating-point number's
value exactly.(2)
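
   The effect of raising the precision in `CONVFMT' can be sketched as
follows.  The function name `roundtrip' is hypothetical; it converts
the sum `0.1 + 0.2' to a string under the given format and reports
whether that string converts back to the same number.  This assumes
IEEE 754 double-precision arithmetic.

```shell
# Hypothetical helper: convert 0.1 + 0.2 to a string under a given CONVFMT
# and report whether the string converts back to the same number.
roundtrip() {
    awk -v fmt="$1" 'BEGIN {
        CONVFMT = fmt
        x = 0.1 + 0.2
        s = x ""                 # number-to-string conversion uses CONVFMT
        print s, ((s + 0) == x)  # 1 if no digits were lost
    }'
}

roundtrip "%.6g"
roundtrip "%.17g"
```

   With the default format the string is `0.3' and the round trip fails
(`0.3 0'); with `"%.17g"' the string is `0.30000000000000004' and the
round trip succeeds (`0.30000000000000004 1').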

   Unlike numbers in the abstract sense (such as what you studied in
high school or college math), numbers stored in computers are limited
in certain ways.  They cannot hold an infinite number of digits, and
floating-point numbers in particular cannot always represent a value
exactly.  Here is an example:

     $ awk '{ printf("%010d\n", $1 * 100) }'
     515.79
     -| 0000051579
     515.80
     -| 0000051579
     515.81
     -| 0000051580
     515.82
     -| 0000051582
     Ctrl-d

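This shows that some values can be represented exactly, whereas others
are only approximated.  For instance, the value stored for `515.80' is
slightly less than 515.80, so the product is slightly less than 51580,
and `%d' discards the fractional part.  This is not a "bug" in `awk',
but simply an artifact of how computers represent numbers.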
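
   One common way to sidestep the truncation is to round to the nearest
integer when formatting, for example with `%.0f' instead of `%d'.  The
following is a minimal sketch under that assumption; the function name
`to_cents' is hypothetical.

```shell
# Hypothetical helper: scale a decimal amount by 100 and round to nearest.
to_cents() {
    echo "$1" | awk '{ printf("%010.0f\n", $1 * 100) }'
}

to_cents 515.80
to_cents 515.81
```

   Here `515.80' yields `0000051580' and `515.81' yields `0000051581'.
Rounding hides the representation error in the output; it does not
remove it from the stored value.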

   Another peculiarity of floating-point numbers on modern systems is
that they often have more than one representation for the number zero!
In particular, it is possible to represent "minus zero" as well as
regular, or "positive" zero.

   This example shows that negative and positive zero are distinct
values when stored internally, but that they are in fact equal to each
other, as well as to "regular" zero:

     $ gawk 'BEGIN { mz = -0 ; pz = 0
     > printf "-0 = %g, +0 = %g, (-0 == +0) -> %d\n", mz, pz, mz == pz
     > printf "mz == 0 -> %d, pz == 0 -> %d\n", mz == 0, pz == 0
     > }'
     -| -0 = -0, +0 = 0, (-0 == +0) -> 1
     -| mz == 0 -> 1, pz == 0 -> 1

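   It helps to keep this in mind should you process numeric data that
contains negative zero values.  The two zeros compare equal as numbers,
but the sign is kept and shows up in formatted output (`-0' versus `0'
above), so it can affect comparisons made on the string form.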
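
   A minimal sketch of detecting and normalizing a negative zero
follows.  It assumes IEEE 754 arithmetic, in which `-1 * 0' produces
negative zero and adding positive zero to it yields positive zero.  The
function name `show_negzero' is hypothetical.

```shell
# Hypothetical helper: show a negative zero, test for it via its formatted
# form, and normalize it by adding positive zero.
show_negzero() {
    awk 'BEGIN {
        mz = -1 * 0                             # IEEE 754 product is negative zero
        is_neg = (sprintf("%g", mz) == "-0")    # string comparison sees the sign
        printf "%g %d %g\n", mz, is_neg, mz + 0 # adding +0 yields positive zero
    }'
}

show_negzero
```

   Under these assumptions the output is `-0 1 0': the value formats as
`-0', the string test detects the sign, and `mz + 0' formats as a plain
`0'.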

   ---------- Footnotes ----------

   (1) `http://www.validgh.com/goldberg/paper.ps'

   (2) Pathological cases can require up to 752 digits (!), but we
doubt that you need to worry about this.
