Expressions
===========
The internal representation for expressions is for the most part
quite straightforward. However, there are a few facts that one must
bear in mind. In particular, the expression "tree" is actually a
directed acyclic graph. (For example there may be many references to
the integer constant zero throughout the source program; many of these
will be represented by the same expression node.) You should not rely
on certain kinds of nodes being shared, nor should you rely on certain
kinds of nodes being unshared.
The following macros can be used with all expression nodes:
`TREE_TYPE'
Returns the type of the expression. This value may not be
precisely the same type that would be given the expression in the
original program.
In what follows, some nodes that one might expect to always have type
`bool' are documented to have either integral or boolean type. At some
point in the future, the C front end may also make use of this same
intermediate representation, and at this point these nodes will
certainly have integral type. The previous sentence is not meant to
imply that the C++ front end does not or will not give these nodes
integral type.
Below, we list the various kinds of expression nodes. Except where
noted otherwise, the operands to an expression are accessed using the
`TREE_OPERAND' macro. For example, to access the first operand to a
binary plus expression `expr', use:
TREE_OPERAND (expr, 0)
As this example indicates, the operands are zero-indexed.
The table below begins with constants, moves on to unary expressions,
then proceeds to binary expressions, and concludes with various other
kinds of expressions:
`INTEGER_CST'
These nodes represent integer constants. Note that the type of
these constants is obtained with `TREE_TYPE'; they are not always
of type `int'. In particular, `char' constants are represented
with `INTEGER_CST' nodes. The value of the integer constant `e' is
given by
((TREE_INT_CST_HIGH (e) << HOST_BITS_PER_WIDE_INT)
+ TREE_INT_CST_LOW (e))
HOST_BITS_PER_WIDE_INT is at least thirty-two on all platforms.
Both `TREE_INT_CST_HIGH' and `TREE_INT_CST_LOW' return a
`HOST_WIDE_INT'. The value of an `INTEGER_CST' is interpreted as
a signed or unsigned quantity depending on the type of the
constant. In general, the expression given above will overflow,
so it should not be used to calculate the value of the constant.
The variable `integer_zero_node' is an integer constant with value
zero. Similarly, `integer_one_node' is an integer constant with
value one. The `size_zero_node' and `size_one_node' variables are
analogous, but have type `size_t' rather than `int'.
The function `tree_int_cst_lt' is a predicate which holds if its
first argument is less than its second. Both constants are
assumed to have the same signedness (i.e., either both should be
signed or both should be unsigned.) The full width of the
constant is used when doing the comparison; the usual rules about
promotions and conversions are ignored. Similarly,
`tree_int_cst_equal' holds if the two constants are equal. The
`tree_int_cst_sgn' function returns the sign of a constant. The
value is `1', `0', or `-1' according to whether the constant is
greater than, equal to, or less than zero. Again, the signedness
of the constant's type is taken into account; an unsigned constant
is never less than zero, no matter what its bit-pattern.
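The two-word layout and the signedness rules above can be modeled in a
few lines of C. This is only a sketch: `struct toy_int_cst',
`toy_cst_sgn' and `toy_cst_lt' are hypothetical names, the words are
fixed at 32 bits for illustration, and none of this is GCC's actual
implementation. The high and low words are compared separately, so the
overflowing expression given earlier is never formed.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a two-word integer constant; NOT GCC's real layout.
   Each word is 32 bits here, so the full constant is 64 bits wide.  */
struct toy_int_cst {
  uint32_t low;      /* least significant word */
  uint32_t high;     /* most significant word */
  int is_unsigned;   /* signedness taken from the constant's type */
};

/* Sign of the constant: 1, 0, or -1.  An unsigned constant is never
   negative, whatever its bit pattern.  */
static int toy_cst_sgn(const struct toy_int_cst *c)
{
  if (c->low == 0 && c->high == 0)
    return 0;
  if (!c->is_unsigned && (c->high & 0x80000000u))
    return -1;
  return 1;
}

/* Nonzero if A < B, comparing the full two-word width.  Both
   constants must have the same signedness.  */
static int toy_cst_lt(const struct toy_int_cst *a,
                      const struct toy_int_cst *b)
{
  assert(a->is_unsigned == b->is_unsigned);
  if (a->high != b->high) {
    if (a->is_unsigned)
      return a->high < b->high;
    /* Signed: flipping the sign bit maps two's-complement order
       onto unsigned order.  */
    return (a->high ^ 0x80000000u) < (b->high ^ 0x80000000u);
  }
  /* Equal high words: the low word is always compared unsigned.  */
  return a->low < b->low;
}
```

With both words set to all ones, the signed constant compares less
than one and has sign `-1', while the unsigned constant with the same
bit pattern compares greater than one and has sign `1'.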
`REAL_CST'
FIXME: Talk about how to obtain representations of this constant,
do comparisons, and so forth.
`COMPLEX_CST'
These nodes are used to represent complex number constants, that
is a `__complex__' whose parts are constant nodes. The
`TREE_REALPART' and `TREE_IMAGPART' return the real and the
imaginary parts respectively.
`STRING_CST'
These nodes represent string-constants. The `TREE_STRING_LENGTH'
returns the length of the string, as an `int'. The
`TREE_STRING_POINTER' is a `char*' containing the string itself.
The string may not be `NUL'-terminated, and it may contain
embedded `NUL' characters. Therefore, the `TREE_STRING_LENGTH'
includes the trailing `NUL' if it is present.
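Because of embedded `NUL' characters, a consumer should copy or compare
string constants using the recorded length, not `strlen'. The following
sketch illustrates the point with a hypothetical `struct
toy_string_cst' whose two fields stand in for `TREE_STRING_POINTER' and
`TREE_STRING_LENGTH'; the names are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a string constant: a byte pointer plus an explicit
   length.  The names are hypothetical, not GCC's.  */
struct toy_string_cst {
  const char *pointer;  /* may contain embedded NULs */
  int length;           /* counts every byte, including any trailing NUL */
};

/* Copy the constant's bytes into fresh storage.  The explicit length
   is used because strlen would stop at the first embedded NUL.  The
   caller frees the result; NULL is returned if allocation fails.  */
static char *toy_string_copy(const struct toy_string_cst *s)
{
  assert(s->length >= 0);
  char *copy = malloc(s->length > 0 ? (size_t)s->length : 1);
  if (copy != NULL && s->length > 0)
    memcpy(copy, s->pointer, (size_t)s->length);
  return copy;
}
```

For the six bytes of `"ab\0cd"' (five characters plus the trailing
`NUL'), the copy preserves all six, whereas `strlen' on the same data
reports only two.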
For wide string constants, the `TREE_STRING_LENGTH' is the number
of bytes in the string, and the `TREE_STRING_POINTER' points to an
array of the bytes of the string, as represented on the target
system (that is, as integers in the target endianness). Wide and
non-wide string constants are distinguished only by the `TREE_TYPE'
of the `STRING_CST'.
FIXME: The formats of string constants are not well-defined when
the target system bytes are not the same width as host system
bytes.
`PTRMEM_CST'
These nodes are used to represent pointer-to-member constants. The
`PTRMEM_CST_CLASS' is the class type (either a `RECORD_TYPE' or
`UNION_TYPE') within which the pointer points, and the
`PTRMEM_CST_MEMBER' is the declaration for the pointed-to object.
Note that the `DECL_CONTEXT' for the `PTRMEM_CST_MEMBER' is in
general different from the `PTRMEM_CST_CLASS'. For example, given:
struct B { int i; };
struct D : public B {};
int D::*dp = &D::i;
The `PTRMEM_CST_CLASS' for `&D::i' is `D', even though the
`DECL_CONTEXT' for the `PTRMEM_CST_MEMBER' is `B', since `B::i' is
a member of `B', not `D'.
`VAR_DECL'
These nodes represent variables, including static data members.
For more information, see Declarations.
`NEGATE_EXPR'
These nodes represent unary negation of the single operand, for
both integer and floating-point types. The type of negation can be
determined by looking at the type of the expression.
`BIT_NOT_EXPR'
These nodes represent bitwise complement, and will always have
integral type. The only operand is the value to be complemented.
`TRUTH_NOT_EXPR'
These nodes represent logical negation, and will always have
integral (or boolean) type. The operand is the value being
negated.
`PREDECREMENT_EXPR'
`PREINCREMENT_EXPR'
`POSTDECREMENT_EXPR'
`POSTINCREMENT_EXPR'
These nodes represent increment and decrement expressions. The
value of the single operand is computed, and the operand
incremented or decremented. In the case of `PREDECREMENT_EXPR' and
`PREINCREMENT_EXPR', the value of the expression is the value
resulting after the increment or decrement; in the case of
`POSTDECREMENT_EXPR' and `POSTINCREMENT_EXPR', it is the value before
the increment or decrement occurs. The type of the operand, like
that of the result, will be either integral, boolean, or
floating-point.
`ADDR_EXPR'
These nodes are used to represent the address of an object. (These
expressions will always have pointer or reference type.) The
operand may be another expression, or it may be a declaration.
As an extension, GCC allows users to take the address of a label.
In this case, the operand of the `ADDR_EXPR' will be a
`LABEL_DECL'. The type of such an expression is `void*'.
If the object addressed is not an lvalue, a temporary is created,
and the address of the temporary is used.
`INDIRECT_REF'
These nodes are used to represent the object pointed to by a
pointer. The operand is the pointer being dereferenced; it will
always have pointer or reference type.
`FIX_TRUNC_EXPR'
These nodes represent conversion of a floating-point value to an
integer. The single operand will have a floating-point type,
while the complete expression will have an integral (or
boolean) type. The operand is rounded towards zero.
`FLOAT_EXPR'
These nodes represent conversion of an integral (or boolean) value
to a floating-point value. The single operand will have integral
type, while the complete expression will have a floating-point
type.
FIXME: How is the operand supposed to be rounded? Is this
dependent on `-mieee'?
`COMPLEX_EXPR'
These nodes are used to represent complex numbers constructed from
two expressions of the same (integer or real) type. The first
operand is the real part and the second operand is the imaginary
part.
`CONJ_EXPR'
These nodes represent the conjugate of their operand.
`REALPART_EXPR'
`IMAGPART_EXPR'
These nodes represent respectively the real and the imaginary parts
of complex numbers (their sole argument).
`NON_LVALUE_EXPR'
These nodes indicate that their one and only operand is not an
lvalue. A back end can treat these identically to the single
operand.
`NOP_EXPR'
These nodes are used to represent conversions that do not require
any code-generation. For example, conversion of a `char*' to an
`int*' does not require any code be generated; such a conversion is
represented by a `NOP_EXPR'. The single operand is the expression
to be converted. The conversion from a pointer to a reference is
also represented with a `NOP_EXPR'.
`CONVERT_EXPR'
These nodes are similar to `NOP_EXPR's, but are used in those
situations where code may need to be generated. For example, if an
`int*' is converted to an `int' code may need to be generated on
some platforms. These nodes are never used for C++-specific
conversions, like conversions between pointers to different
classes in an inheritance hierarchy. Any adjustments that need to
be made in such cases are always indicated explicitly. Similarly,
a user-defined conversion is never represented by a
`CONVERT_EXPR'; instead, the function calls are made explicit.
`THROW_EXPR'
These nodes represent `throw' expressions. The single operand is
an expression for the code that should be executed to throw the
exception. However, there is one implicit action not represented
in that expression; namely the call to `__throw'. This function
takes no arguments. If `setjmp'/`longjmp' exceptions are used, the
function `__sjthrow' is called instead. The normal GCC back end
uses the function `emit_throw' to generate this code; you can
examine this function to see what needs to be done.
`LSHIFT_EXPR'
`RSHIFT_EXPR'
These nodes represent left and right shifts, respectively. The
first operand is the value to shift; it will always be of integral
type. The second operand is an expression for the number of bits
by which to shift. Right shift should be treated as arithmetic,
i.e., the high-order bits should be zero-filled when the
expression has unsigned type and filled with the sign bit when the
expression has signed type.
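The fill rule for right shifts can be written out explicitly. The
sketch below works on a raw 32-bit pattern and takes the signedness as
a flag, which stands in for inspecting the type of the expression;
`toy_rshift' is a hypothetical helper, not a GCC function.

```c
#include <assert.h>
#include <stdint.h>

/* Right-shift a 32-bit pattern by COUNT bits (0..31), choosing the
   fill bit from the signedness of the expression's type.  This models
   the rule in the text; it is not GCC code.  */
static uint32_t toy_rshift(uint32_t bits, unsigned count, int is_unsigned)
{
  assert(count < 32);
  uint32_t result = bits >> count;          /* zero-fills the top bits */
  if (!is_unsigned && (bits & 0x80000000u) && count > 0)
    result |= ~(0xFFFFFFFFu >> count);      /* copy the sign bit down */
  return result;
}
```

Shifting the pattern `0x80000000' right by four bits gives
`0xF8000000' when the type is signed and `0x08000000' when it is
unsigned.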
`BIT_IOR_EXPR'
`BIT_XOR_EXPR'
`BIT_AND_EXPR'
These nodes represent bitwise inclusive or, bitwise exclusive or,
and bitwise and, respectively. Both operands will always have
integral type.
`TRUTH_ANDIF_EXPR'
`TRUTH_ORIF_EXPR'
These nodes represent logical and and logical or, respectively.
These operators are not strict; i.e., the second operand is
evaluated only if the value of the expression is not determined by
evaluation of the first operand. The type of the operands, and
the result type, is always of boolean or integral type.
`TRUTH_AND_EXPR'
`TRUTH_OR_EXPR'
`TRUTH_XOR_EXPR'
These nodes represent logical and, logical or, and logical
exclusive or. They are strict; both arguments are always
evaluated. There are no corresponding operators in C or C++, but
the front end will sometimes generate these expressions anyhow, if
it can tell that strictness does not matter.
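The difference between the strict and non-strict forms is observable
only when evaluating an operand has a side effect. The following
sketch makes that visible by counting evaluations; all of the `toy_'
names are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Counts how many operands have been evaluated, so the difference
   between the two forms is observable.  */
static int toy_eval_count;

/* A toy operand: evaluating it bumps the counter and yields VALUE.  */
static int toy_operand(int value)
{
  toy_eval_count++;
  return value;
}

/* Non-strict and, in the manner of TRUTH_ANDIF_EXPR: the second
   operand is evaluated only when the first is nonzero.  */
static int toy_andif(int a, int b)
{
  return toy_operand(a) && toy_operand(b);
}

/* Strict and, in the manner of TRUTH_AND_EXPR: both operands are
   always evaluated before the results are combined.  */
static int toy_and(int a, int b)
{
  int lhs = toy_operand(a);
  int rhs = toy_operand(b);
  return (lhs != 0) & (rhs != 0);
}
```

Both functions return the same truth value for the same inputs; with a
zero first operand, `toy_andif' evaluates one operand and `toy_and'
evaluates two.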
`PLUS_EXPR'
`MINUS_EXPR'
`MULT_EXPR'
`TRUNC_DIV_EXPR'
`TRUNC_MOD_EXPR'
`RDIV_EXPR'
These nodes represent various binary arithmetic operations.
Respectively, these operations are addition, subtraction (of the
second operand from the first), multiplication, integer division,
integer remainder, and floating-point division. The operands to
the first three of these may have either integral or floating
type, but there will never be a case in which one operand is of
floating type and the other is of integral type.
The result of a `TRUNC_DIV_EXPR' is always rounded towards zero.
The `TRUNC_MOD_EXPR' of two operands `a' and `b' is always `a -
(a/b)*b' where the division is as if computed by a `TRUNC_DIV_EXPR'.
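The relationship between the truncating quotient and its remainder can
be checked directly in C, where integer division has rounded towards
zero since C99. The helpers `toy_trunc_div' and `toy_trunc_mod' are
hypothetical names used only in this sketch.

```c
#include <assert.h>

/* Quotient rounded towards zero.  C99 and later define integer
   division this way, so the built-in operator already matches.  */
static int toy_trunc_div(int a, int b)
{
  assert(b != 0);
  return a / b;
}

/* Remainder defined from the truncating quotient: a - (a/b)*b.
   The result is zero or takes the sign of A.  */
static int toy_trunc_mod(int a, int b)
{
  return a - toy_trunc_div(a, b) * b;
}
```

For `a' equal to -7 and `b' equal to 2, the quotient is -3 and the
remainder is -1, so that `(a/b)*b' plus the remainder gives back `a'.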
`ARRAY_REF'
These nodes represent array accesses. The first operand is the
array; the second is the index. To calculate the address of the
memory accessed, you must scale the index by the size of the type
of the array elements.
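The scaling step amounts to ordinary byte-address arithmetic. In the
sketch below, `toy_array_ref_addr' is a hypothetical helper that takes
the element size as an explicit argument; a real consumer would derive
that size from the type of the array.

```c
#include <assert.h>
#include <stddef.h>

/* Byte address of element INDEX in an array that starts at BASE and
   whose elements are ELEM_SIZE bytes each.  The index is scaled by
   the element size, as the text describes.  */
static const unsigned char *toy_array_ref_addr(const void *base,
                                               size_t index,
                                               size_t elem_size)
{
  assert(elem_size > 0);
  return (const unsigned char *)base + index * elem_size;
}
```

For an array `a' of `int', the address computed for index 2 with
element size `sizeof (int)' is the same as `&a[2]'.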
`EXACT_DIV_EXPR'
Document.
`LT_EXPR'
`LE_EXPR'
`GT_EXPR'
`GE_EXPR'
`EQ_EXPR'
`NE_EXPR'
These nodes represent the less than, less than or equal to, greater
than, greater than or equal to, equal, and not equal comparison
operators. The first and second operands will either both be of
integral type or both of floating type. The result type of these
expressions will always be of integral or boolean type.
`MODIFY_EXPR'
These nodes represent assignment. The left-hand side is the first
operand; the right-hand side is the second operand. The left-hand
side will be a `VAR_DECL', `INDIRECT_REF', `COMPONENT_REF', or
other lvalue.
These nodes are used to represent not only assignment with `=' but
also compound assignments (like `+='), by reduction to `='
assignment. In other words, the representation for `i += 3' looks
just like that for `i = i + 3'.
`INIT_EXPR'
These nodes are just like `MODIFY_EXPR', but are used only when a
variable is initialized, rather than assigned to subsequently.
`COMPONENT_REF'
These nodes represent non-static data member accesses. The first
operand is the object (rather than a pointer to it); the second
operand is the `FIELD_DECL' for the data member.
`COMPOUND_EXPR'
These nodes represent comma-expressions. The first operand is an
expression whose value is computed and thrown away prior to the
evaluation of the second operand. The value of the entire
expression is the value of the second operand.
`COND_EXPR'
These nodes represent `?:' expressions. The first operand is of
boolean or integral type. If it evaluates to a nonzero value, the
second operand should be evaluated, and returned as the value of
the expression. Otherwise, the third operand is evaluated, and
returned as the value of the expression. As a GNU extension, the
middle operand of the `?:' operator may be omitted in the source,
like this:
x ? : 3
which is equivalent to
x ? x : 3
assuming that `x' is an expression without side-effects. However,
in the case that the first operand causes side effects, the
side-effects occur only once. Consumers of the internal
representation do not need to worry about this oddity; the second
operand will always be present in the internal representation.
`CALL_EXPR'
These nodes are used to represent calls to functions, including
non-static member functions. The first operand is a pointer to the
function to call; it is always an expression whose type is a
`POINTER_TYPE'. The second argument is a `TREE_LIST'. The
arguments to the call appear left-to-right in the list. The
`TREE_VALUE' of each list node contains the expression
corresponding to that argument. (The value of `TREE_PURPOSE' for
these nodes is unspecified, and should be ignored.) For non-static
member functions, there will be an operand corresponding to the
`this' pointer. There will always be expressions corresponding to
all of the arguments, even if the function is declared with default
arguments and some arguments are not explicitly provided at the
call sites.
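Walking an argument list is a plain linked-list traversal. The sketch
below uses a hypothetical `struct toy_list' in place of a `TREE_LIST';
its `value' field plays the role of `TREE_VALUE', and the assumption
that successive nodes are linked through a `chain' field is made for
this example only.

```c
#include <assert.h>
#include <stddef.h>

/* Toy list node standing in for a TREE_LIST element.  `value' plays
   the role of TREE_VALUE and `chain' links to the next node; the
   "purpose" slot is omitted because it is to be ignored here.  */
struct toy_list {
  int value;
  struct toy_list *chain;
};

/* Count the arguments of a call by walking the list front to back,
   which visits them in left-to-right source order.  */
static int toy_count_args(const struct toy_list *args)
{
  int n = 0;
  for (; args != NULL; args = args->chain)
    n++;
  return n;
}

/* Return the Nth argument (zero-indexed).  N must be in range.  */
static int toy_nth_arg(const struct toy_list *args, int n)
{
  assert(n >= 0 && n < toy_count_args(args));
  while (n-- > 0)
    args = args->chain;
  return args->value;
}
```

For a non-static member function, the `this' argument would simply be
one more node in such a list.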
`STMT_EXPR'
These nodes are used to represent GCC's statement-expression
extension. The statement-expression extension allows code like
this:
int f() { return ({ int j; j = 3; j + 7; }); }
In other words, a sequence of statements may occur where a single
expression would normally appear. The `STMT_EXPR' node represents
such an expression. The `STMT_EXPR_STMT' gives the statement
contained in the expression; this is always a `COMPOUND_STMT'. The
value of the expression is the value of the last sub-statement in
the `COMPOUND_STMT'. More precisely, the value is the value
computed by the last `EXPR_STMT' in the outermost scope of the
`COMPOUND_STMT'. For example, in:
({ 3; })
the value is `3' while in:
({ if (x) { 3; } })
(represented by a nested `COMPOUND_STMT'), there is no value. If
the `STMT_EXPR' does not yield a value, its type will be `void'.
`BIND_EXPR'
These nodes represent local blocks. The first operand is a list of
temporary variables, connected via their `TREE_CHAIN' field. These
will never require cleanups. The scope of these variables is just
the body of the `BIND_EXPR'. The body of the `BIND_EXPR' is the
second operand.
`LOOP_EXPR'
These nodes represent "infinite" loops. The `LOOP_EXPR_BODY'
represents the body of the loop. It should be executed forever,
unless an `EXIT_EXPR' is encountered.
`EXIT_EXPR'
These nodes represent conditional exits from the nearest enclosing
`LOOP_EXPR'. The single operand is the condition; if it is
nonzero, then the loop should be exited. An `EXIT_EXPR' will only
appear within a `LOOP_EXPR'.
`CLEANUP_POINT_EXPR'
These nodes represent full-expressions. The single operand is an
expression to evaluate. Any destructor calls engendered by the
creation of temporaries during the evaluation of that expression
should be performed immediately after the expression is evaluated.
`CONSTRUCTOR'
These nodes represent the brace-enclosed initializers for a
structure or array. The first operand is reserved for use by the
back end. The second operand is a `TREE_LIST'. If the
`TREE_TYPE' of the `CONSTRUCTOR' is a `RECORD_TYPE' or
`UNION_TYPE', then the `TREE_PURPOSE' of each node in the
`TREE_LIST' will be a `FIELD_DECL' and the `TREE_VALUE' of each
node will be the expression used to initialize that field. You
should not depend on the fields appearing in any particular order,
nor should you assume that all fields will be represented.
Unrepresented fields may be assigned any value.
If the `TREE_TYPE' of the `CONSTRUCTOR' is an `ARRAY_TYPE', then
the `TREE_PURPOSE' of each element in the `TREE_LIST' will be an
`INTEGER_CST'. This constant indicates which element of the array
(indexed from zero) is being assigned to; again, the `TREE_VALUE'
is the corresponding initializer. If the `TREE_PURPOSE' is
`NULL_TREE', then the initializer is for the next available array
element.
Conceptually, before any initialization is done, the entire area of
storage is initialized to zero.
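The rules for array initializers can be sketched as follows. The
structure `struct toy_init' is a hypothetical stand-in for one node of
the `TREE_LIST', with `has_index' equal to zero modeling a `NULL_TREE'
purpose. Treating "the next available element" as the slot after the
most recently initialized one is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy initializer element.  `has_index' is zero when the element's
   "purpose" is absent, meaning "next available element".  */
struct toy_init {
  int has_index;
  size_t index;   /* meaningful only when has_index is nonzero */
  int value;
};

/* Apply N initializer elements to the array DEST of LEN ints.  The
   whole array is zeroed first; each element then goes to its explicit
   index, or to the slot after the previously initialized one.  */
static void toy_apply_ctor(int *dest, size_t len,
                           const struct toy_init *elts, size_t n)
{
  size_t next = 0;
  memset(dest, 0, len * sizeof *dest);
  for (size_t i = 0; i < n; i++) {
    size_t at = elts[i].has_index ? elts[i].index : next;
    assert(at < len);
    dest[at] = elts[i].value;
    next = at + 1;
  }
}
```

Elements that no initializer names keep the zero written by the
initial clearing step.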
`SAVE_EXPR'
A `SAVE_EXPR' represents an expression (possibly involving
side-effects) that is used more than once. The side-effects should
occur only the first time the expression is evaluated. Subsequent
uses should just reuse the computed value. The first operand to
the `SAVE_EXPR' is the expression to evaluate. The side-effects
should be executed where the `SAVE_EXPR' is first encountered in a
depth-first preorder traversal of the expression tree.
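One way for a consumer to honor these semantics is to cache the value
the first time the node is evaluated. The sketch below shows the idea
with a hypothetical `struct toy_save_expr'; it is an illustration of
the evaluate-once rule, not a description of how GCC itself expands
these nodes.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a SAVE_EXPR: a computation plus a one-slot cache.  */
struct toy_save_expr {
  int (*compute)(void);  /* wrapped expression; may have side effects */
  int evaluated;         /* nonzero once the value has been computed */
  int value;             /* cached result, valid when evaluated != 0 */
};

/* Evaluate the wrapped expression on first use only; later uses
   return the cached value without repeating the side effects.  */
static int toy_save_expr_eval(struct toy_save_expr *s)
{
  assert(s->compute != NULL);
  if (!s->evaluated) {
    s->value = s->compute();
    s->evaluated = 1;
  }
  return s->value;
}

/* Example wrapped expression: increments a counter as its side
   effect and returns the new count.  */
static int toy_counter;
static int toy_bump(void)
{
  return ++toy_counter;
}
```

Evaluating the same node twice yields the same value both times, and
the counter is incremented only once.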
`TARGET_EXPR'
A `TARGET_EXPR' represents a temporary object. The first operand
is a `VAR_DECL' for the temporary variable. The second operand is
the initializer for the temporary. The initializer is evaluated,
and copied (bitwise) into the temporary.
Often, a `TARGET_EXPR' occurs on the right-hand side of an
assignment, or as the second operand to a comma-expression which is
itself the right-hand side of an assignment, etc. In this case,
we say that the `TARGET_EXPR' is "normal"; otherwise, we say it is
"orphaned". For a normal `TARGET_EXPR' the temporary variable
should be treated as an alias for the left-hand side of the
assignment, rather than as a new temporary variable.
The third operand to the `TARGET_EXPR', if present, is a
cleanup-expression (i.e., destructor call) for the temporary. If
the `TARGET_EXPR' is orphaned, then the cleanup must be executed
when the statement containing the `TARGET_EXPR' is complete. These
cleanups must always be executed in the order opposite to that in
which they were encountered. Note that if a temporary is created
on one branch of a conditional operator (i.e., in the second or
third operand to a `COND_EXPR'), the cleanup must be run only if
that branch is actually executed.
See `STMT_IS_FULL_EXPR_P' for more information about running these
cleanups.
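The ordering requirement for cleanups suggests a stack. In the sketch
below a cleanup is registered at the point where its temporary is
created, so a temporary on an untaken `COND_EXPR' branch is never
registered, and the pending cleanups are run last-in, first-out. All
of the `toy_' names are hypothetical, and the fixed capacity is an
arbitrary choice for this example.

```c
#include <assert.h>

#define TOY_MAX_CLEANUPS 16

/* Pending cleanups for one statement, kept as a stack so they can be
   run in the reverse of the order in which they were registered.  */
struct toy_cleanups {
  void (*fn[TOY_MAX_CLEANUPS])(void);
  int count;
};

/* Register a cleanup when its temporary is created.  A temporary made
   on a branch that is not taken never reaches this call.  */
static void toy_push_cleanup(struct toy_cleanups *c, void (*fn)(void))
{
  assert(c->count < TOY_MAX_CLEANUPS);
  c->fn[c->count++] = fn;
}

/* Run every pending cleanup, most recently registered first.  */
static void toy_run_cleanups(struct toy_cleanups *c)
{
  while (c->count > 0)
    c->fn[--c->count]();
}

/* Two example cleanups that record the order in which they ran.  */
static char toy_log[8];
static int toy_log_len;
static void toy_cleanup_a(void) { toy_log[toy_log_len++] = 'a'; }
static void toy_cleanup_b(void) { toy_log[toy_log_len++] = 'b'; }
```

If `toy_cleanup_a' is registered before `toy_cleanup_b', running the
cleanups records `b' first and then `a'.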
`AGGR_INIT_EXPR'
An `AGGR_INIT_EXPR' represents the initialization as the return
value of a function call, or as the result of a constructor. An
`AGGR_INIT_EXPR' will only appear as the second operand of a
`TARGET_EXPR'. The first operand to the `AGGR_INIT_EXPR' is the
address of a function to call, just as in a `CALL_EXPR'. The
second operand is the list of arguments to pass to that function, as a
`TREE_LIST', again in a manner similar to that of a `CALL_EXPR'.
The value of the expression is that returned by the function.
If `AGGR_INIT_VIA_CTOR_P' holds of the `AGGR_INIT_EXPR', then the
initialization is via a constructor call. The address of the third
operand of the `AGGR_INIT_EXPR', which is always a `VAR_DECL', is
taken, and this value replaces the first argument in the argument
list. In this case, the value of the expression is the `VAR_DECL'
given by the third operand to the `AGGR_INIT_EXPR'; constructors do
not return a value.