Copyright (C) 2000-2012
Floating Point Numbers
======================

Most computer hardware has support for two different kinds of numbers: integers (..., -3, -2, -1, 0, 1, 2, 3, ...) and floating-point numbers. Floating-point numbers have three parts: the "mantissa", the "exponent", and the "sign bit". The real number represented by a floating-point value is given by

     (s ? -1 : 1) * 2^e * M

where s is the sign bit, e the exponent, and M the mantissa. See Floating Point Concepts for details. (It is possible to have a different "base" for the exponent, but all modern hardware uses 2.)

Floating-point numbers can represent only a finite subset of the real numbers. While this subset is large enough for most purposes, it is important to remember that the only reals that can be represented exactly are rational numbers whose binary expansion terminates and fits within the width of the mantissa. Even simple fractions such as 1/5 can only be approximated in floating point.

Mathematical operations and functions frequently need to produce values that are not representable. Often these values can be approximated closely enough for practical purposes, but sometimes they can't. Historically there was no way to tell when the results of a calculation were inaccurate. Modern computers implement the IEEE 754 standard for numerical computations, which defines a framework for indicating to the program when the results of a calculation are not trustworthy. This framework consists of a set of "exceptions" that indicate why a result could not be represented, and the special values "infinity" and "not a number" (NaN).