Copyright (C) 2000-2012 |
Expressions
===========

The internal representation for expressions is for the most part quite
straightforward. However, there are a few facts that one must bear in
mind. In particular, the expression "tree" is actually a directed
acyclic graph. (For example, there may be many references to the integer
constant zero throughout the source program; many of these will be
represented by the same expression node.) You should not rely on certain
kinds of node being shared, nor should you rely on certain kinds of
nodes being unshared.

The following macros can be used with all expression nodes:

`TREE_TYPE'
     Returns the type of the expression. This value may not be precisely
     the same type that would be given the expression in the original
     program.

In what follows, some nodes that one might expect to always have type
`bool' are documented to have either integral or boolean type. At some
point in the future, the C front end may also make use of this same
intermediate representation, and at this point these nodes will
certainly have integral type. The previous sentence is not meant to
imply that the C++ front end does not or will not give these nodes
integral type.

Below, we list the various kinds of expression nodes. Except where
noted otherwise, the operands to an expression are accessed using the
`TREE_OPERAND' macro. For example, to access the first operand to a
binary plus expression `expr', use:

     TREE_OPERAND (expr, 0)

As this example indicates, the operands are zero-indexed.

The table below begins with constants, moves on to unary expressions,
then proceeds to binary expressions, and concludes with various other
kinds of expressions:

`INTEGER_CST'
     These nodes represent integer constants. Note that the type of
     these constants is obtained with `TREE_TYPE'; they are not always
     of type `int'. In particular, `char' constants are represented
     with `INTEGER_CST' nodes.
     The value of the integer constant `e' is given by

          ((TREE_INT_CST_HIGH (e) << HOST_BITS_PER_WIDE_INT)
           + TREE_INT_CST_LOW (e))

     `HOST_BITS_PER_WIDE_INT' is at least thirty-two on all platforms.
     Both `TREE_INT_CST_HIGH' and `TREE_INT_CST_LOW' return a
     `HOST_WIDE_INT'. The value of an `INTEGER_CST' is interpreted as a
     signed or unsigned quantity depending on the type of the constant.
     In general, the expression given above will overflow, so it should
     not be used to calculate the value of the constant.

     The variable `integer_zero_node' is an integer constant with value
     zero. Similarly, `integer_one_node' is an integer constant with
     value one. The `size_zero_node' and `size_one_node' variables are
     analogous, but have type `size_t' rather than `int'.

     The function `tree_int_cst_lt' is a predicate which holds if its
     first argument is less than its second. Both constants are assumed
     to have the same signedness (i.e., either both should be signed or
     both should be unsigned). The full width of the constant is used
     when doing the comparison; the usual rules about promotions and
     conversions are ignored. Similarly, `tree_int_cst_equal' holds if
     the two constants are equal. The `tree_int_cst_sgn' function
     returns the sign of a constant. The value is `1', `0', or `-1'
     according to whether the constant is greater than, equal to, or
     less than zero. Again, the signedness of the constant's type is
     taken into account; an unsigned constant is never less than zero,
     no matter what its bit-pattern.

`REAL_CST'
     FIXME: Talk about how to obtain representations of this constant,
     do comparisons, and so forth.

`COMPLEX_CST'
     These nodes are used to represent complex number constants, that
     is, a `__complex__' whose parts are constant nodes.
     `TREE_REALPART' and `TREE_IMAGPART' return the real and the
     imaginary parts respectively.

`STRING_CST'
     These nodes represent string constants. The `TREE_STRING_LENGTH'
     returns the length of the string, as an `int'.
     The `TREE_STRING_POINTER' is a `char*' containing the string
     itself. The string may not be `NUL'-terminated, and it may contain
     embedded `NUL' characters. Therefore, the `TREE_STRING_LENGTH'
     includes the trailing `NUL' if it is present.

     For wide string constants, the `TREE_STRING_LENGTH' is the number
     of bytes in the string, and the `TREE_STRING_POINTER' points to an
     array of the bytes of the string, as represented on the target
     system (that is, as integers in the target endianness). Wide and
     non-wide string constants are distinguished only by the
     `TREE_TYPE' of the `STRING_CST'.

     FIXME: The formats of string constants are not well-defined when
     the target system bytes are not the same width as host system
     bytes.

`PTRMEM_CST'
     These nodes are used to represent pointer-to-member constants. The
     `PTRMEM_CST_CLASS' is the class type (either a `RECORD_TYPE' or
     `UNION_TYPE') within which the pointer points, and the
     `PTRMEM_CST_MEMBER' is the declaration for the pointed-to object.
     Note that the `DECL_CONTEXT' for the `PTRMEM_CST_MEMBER' is in
     general different from the `PTRMEM_CST_CLASS'. For example, given:

          struct B { int i; };
          struct D : public B {};
          int D::*dp = &D::i;

     The `PTRMEM_CST_CLASS' for `&D::i' is `D', even though the
     `DECL_CONTEXT' for the `PTRMEM_CST_MEMBER' is `B', since `B::i' is
     a member of `B', not `D'.

`VAR_DECL'
     These nodes represent variables, including static data members.
     For more information, see Declarations.

`NEGATE_EXPR'
     These nodes represent unary negation of the single operand, for
     both integer and floating-point types. The type of negation can be
     determined by looking at the type of the expression.

`BIT_NOT_EXPR'
     These nodes represent bitwise complement, and will always have
     integral type. The only operand is the value to be complemented.

`TRUTH_NOT_EXPR'
     These nodes represent logical negation, and will always have
     integral (or boolean) type. The operand is the value being
     negated.
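The difference between the three unary operators just listed can be
illustrated with a small, self-contained model. This is only a sketch:
the `toy_' names below are hypothetical stand-ins invented for this
example and are not GCC's data structures or macros. The function
`toy_operand' merely mimics the zero-indexed access that `TREE_OPERAND'
provides.

```c
#include <assert.h>
#include <stddef.h>

/* Toy node kinds; only the spelling echoes the real tree codes. */
enum toy_code {
  TOY_INTEGER_CST,
  TOY_NEGATE_EXPR,
  TOY_BIT_NOT_EXPR,
  TOY_TRUTH_NOT_EXPR
};

struct toy_node {
  enum toy_code code;
  long value;                     /* meaningful for TOY_INTEGER_CST only */
  const struct toy_node *ops[1];  /* operands, zero-indexed */
};

/* Hypothetical analogue of TREE_OPERAND (node, i) for unary nodes. */
static const struct toy_node *toy_operand(const struct toy_node *n, int i)
{
  assert(n != NULL && i == 0);
  return n->ops[i];
}

/* Evaluate a toy expression made of constants and unary operators. */
static long toy_eval(const struct toy_node *n)
{
  switch (n->code) {
  case TOY_INTEGER_CST:
    return n->value;
  case TOY_NEGATE_EXPR:           /* arithmetic negation */
    return -toy_eval(toy_operand(n, 0));
  case TOY_BIT_NOT_EXPR:          /* bitwise complement */
    return ~toy_eval(toy_operand(n, 0));
  case TOY_TRUTH_NOT_EXPR:        /* logical negation: result is 0 or 1 */
    return !toy_eval(toy_operand(n, 0));
  }
  return 0;
}

/* Convenience wrapper: apply one unary operator to a constant. */
static long toy_eval_unary(enum toy_code code, long operand_value)
{
  struct toy_node cst = { TOY_INTEGER_CST, operand_value, { NULL } };
  struct toy_node expr = { code, 0, { &cst } };
  return toy_eval(&expr);
}
```

For instance, `toy_eval_unary (TOY_TRUTH_NOT_EXPR, 5)' yields zero,
whereas applying `TOY_BIT_NOT_EXPR' to the same operand flips every bit
of the value instead.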
`PREDECREMENT_EXPR'
`PREINCREMENT_EXPR'
`POSTDECREMENT_EXPR'
`POSTINCREMENT_EXPR'
     These nodes represent increment and decrement expressions. The
     value of the single operand is computed, and the operand
     incremented or decremented. In the case of `PREDECREMENT_EXPR' and
     `PREINCREMENT_EXPR', the value of the expression is the value
     resulting after the increment or decrement; in the case of
     `POSTDECREMENT_EXPR' and `POSTINCREMENT_EXPR', it is the value
     before the increment or decrement occurs. The type of the operand,
     like that of the result, will be either integral, boolean, or
     floating-point.

`ADDR_EXPR'
     These nodes are used to represent the address of an object. (These
     expressions will always have pointer or reference type.) The
     operand may be another expression, or it may be a declaration.

     As an extension, GCC allows users to take the address of a label.
     In this case, the operand of the `ADDR_EXPR' will be a
     `LABEL_DECL'. The type of such an expression is `void*'.

     If the object addressed is not an lvalue, a temporary is created,
     and the address of the temporary is used.

`INDIRECT_REF'
     These nodes are used to represent the object pointed to by a
     pointer. The operand is the pointer being dereferenced; it will
     always have pointer or reference type.

`FIX_TRUNC_EXPR'
     These nodes represent conversion of a floating-point value to an
     integer. The single operand will have a floating-point type, while
     the complete expression will have an integral (or boolean) type.
     The operand is rounded towards zero.

`FLOAT_EXPR'
     These nodes represent conversion of an integral (or boolean) value
     to a floating-point value. The single operand will have integral
     type, while the complete expression will have a floating-point
     type.

     FIXME: How is the operand supposed to be rounded? Is this
     dependent on `-mieee'?

`COMPLEX_EXPR'
     These nodes are used to represent complex numbers constructed from
     two expressions of the same (integer or real) type.
     The first operand is the real part and the second operand is the
     imaginary part.

`CONJ_EXPR'
     These nodes represent the conjugate of their operand.

`REALPART_EXPR'
`IMAGPART_EXPR'
     These nodes represent respectively the real and the imaginary
     parts of complex numbers (their sole argument).

`NON_LVALUE_EXPR'
     These nodes indicate that their one and only operand is not an
     lvalue. A back end can treat these identically to the single
     operand.

`NOP_EXPR'
     These nodes are used to represent conversions that do not require
     any code-generation. For example, conversion of a `char*' to an
     `int*' does not require any code be generated; such a conversion
     is represented by a `NOP_EXPR'. The single operand is the
     expression to be converted. The conversion from a pointer to a
     reference is also represented with a `NOP_EXPR'.

`CONVERT_EXPR'
     These nodes are similar to `NOP_EXPR's, but are used in those
     situations where code may need to be generated. For example, if an
     `int*' is converted to an `int', code may need to be generated on
     some platforms. These nodes are never used for C++-specific
     conversions, like conversions between pointers to different
     classes in an inheritance hierarchy. Any adjustments that need to
     be made in such cases are always indicated explicitly. Similarly,
     a user-defined conversion is never represented by a
     `CONVERT_EXPR'; instead, the function calls are made explicit.

`THROW_EXPR'
     These nodes represent `throw' expressions. The single operand is
     an expression for the code that should be executed to throw the
     exception. However, there is one implicit action not represented
     in that expression; namely the call to `__throw'. This function
     takes no arguments. If `setjmp'/`longjmp' exceptions are used, the
     function `__sjthrow' is called instead. The normal GCC back end
     uses the function `emit_throw' to generate this code; you can
     examine this function to see what needs to be done.

`LSHIFT_EXPR'
`RSHIFT_EXPR'
     These nodes represent left and right shifts, respectively.
     The first operand is the value to shift; it will always be of
     integral type. The second operand is an expression for the number
     of bits by which to shift. Right shift should be treated as
     arithmetic, i.e., the high-order bits should be zero-filled when
     the expression has unsigned type and filled with the sign bit when
     the expression has signed type.

`BIT_IOR_EXPR'
`BIT_XOR_EXPR'
`BIT_AND_EXPR'
     These nodes represent bitwise inclusive or, bitwise exclusive or,
     and bitwise and, respectively. Both operands will always have
     integral type.

`TRUTH_ANDIF_EXPR'
`TRUTH_ORIF_EXPR'
     These nodes represent logical and and logical or, respectively.
     These operators are not strict; i.e., the second operand is
     evaluated only if the value of the expression is not determined by
     evaluation of the first operand. The operands and the result are
     always of boolean or integral type.

`TRUTH_AND_EXPR'
`TRUTH_OR_EXPR'
`TRUTH_XOR_EXPR'
     These nodes represent logical and, logical or, and logical
     exclusive or. They are strict; both arguments are always
     evaluated. There are no corresponding operators in C or C++, but
     the front end will sometimes generate these expressions anyhow, if
     it can tell that strictness does not matter.

`PLUS_EXPR'
`MINUS_EXPR'
`MULT_EXPR'
`TRUNC_DIV_EXPR'
`TRUNC_MOD_EXPR'
`RDIV_EXPR'
     These nodes represent various binary arithmetic operations.
     Respectively, these operations are addition, subtraction (of the
     second operand from the first), multiplication, integer division,
     integer remainder, and floating-point division. The operands to
     the first three of these may have either integral or floating
     type, but there will never be a case in which one operand is of
     floating type and the other is of integral type.

     The result of a `TRUNC_DIV_EXPR' is always rounded towards zero.
     The `TRUNC_MOD_EXPR' of two operands `a' and `b' is always
     `a - (a/b)*b' where the division is as if computed by a
     `TRUNC_DIV_EXPR'.

`ARRAY_REF'
     These nodes represent array accesses.
     The first operand is the array; the second is the index. To
     calculate the address of the memory accessed, you must scale the
     index by the size of the type of the array elements.

`EXACT_DIV_EXPR'
     Document.

`LT_EXPR'
`LE_EXPR'
`GT_EXPR'
`GE_EXPR'
`EQ_EXPR'
`NE_EXPR'
     These nodes represent the less than, less than or equal to,
     greater than, greater than or equal to, equal, and not equal
     comparison operators. The first and second operands will either
     both be of integral type or both be of floating type. The result
     type of these expressions will always be of integral or boolean
     type.

`MODIFY_EXPR'
     These nodes represent assignment. The left-hand side is the first
     operand; the right-hand side is the second operand. The left-hand
     side will be a `VAR_DECL', `INDIRECT_REF', `COMPONENT_REF', or
     other lvalue.

     These nodes are used to represent not only assignment with `=' but
     also compound assignments (like `+='), by reduction to `='
     assignment. In other words, the representation for `i += 3' looks
     just like that for `i = i + 3'.

`INIT_EXPR'
     These nodes are just like `MODIFY_EXPR', but are used only when a
     variable is initialized, rather than assigned to subsequently.

`COMPONENT_REF'
     These nodes represent non-static data member accesses. The first
     operand is the object (rather than a pointer to it); the second
     operand is the `FIELD_DECL' for the data member.

`COMPOUND_EXPR'
     These nodes represent comma-expressions. The first operand is an
     expression whose value is computed and thrown away prior to the
     evaluation of the second operand. The value of the entire
     expression is the value of the second operand.

`COND_EXPR'
     These nodes represent `?:' expressions. The first operand is of
     boolean or integral type. If it evaluates to a nonzero value, the
     second operand should be evaluated, and returned as the value of
     the expression. Otherwise, the third operand is evaluated, and
     returned as the value of the expression.
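The evaluation rule for `COND_EXPR' can be sketched with a toy model
that counts how often each operand is evaluated. Everything named
`toy_' here is hypothetical and exists only for this illustration; it
is not how GCC itself represents or evaluates these nodes.

```c
#include <assert.h>

/* A toy operand: a value plus a count of how often it was evaluated. */
struct toy_operand {
  int value;
  int times_evaluated;
};

/* "Evaluate" an operand, recording that the evaluation happened. */
static int toy_eval_operand(struct toy_operand *op)
{
  op->times_evaluated++;
  return op->value;
}

/* Hypothetical model of COND_EXPR: ops[0] is the condition, ops[1] is
   used when the condition is nonzero, ops[2] otherwise.  Only the
   selected arm is evaluated.  */
static int toy_eval_cond_expr(struct toy_operand *ops[3])
{
  assert(ops[0] && ops[1] && ops[2]);
  if (toy_eval_operand(ops[0]) != 0)
    return toy_eval_operand(ops[1]);
  return toy_eval_operand(ops[2]);
}
```

In this model the condition is evaluated exactly once per evaluation of
the whole expression, and the arm that is not selected is never
touched.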
     As a GNU extension, the middle operand of the `?:' operator may be
     omitted in the source, like this:

          x ? : 3

     which is equivalent to

          x ? x : 3

     assuming that `x' is an expression without side-effects. However,
     in the case that the first operand causes side-effects, the
     side-effects occur only once. Consumers of the internal
     representation do not need to worry about this oddity; the second
     operand will always be present in the internal representation.

`CALL_EXPR'
     These nodes are used to represent calls to functions, including
     non-static member functions. The first operand is a pointer to the
     function to call; it is always an expression whose type is a
     `POINTER_TYPE'. The second operand is a `TREE_LIST'. The arguments
     to the call appear left-to-right in the list. The `TREE_VALUE' of
     each list node contains the expression corresponding to that
     argument. (The value of `TREE_PURPOSE' for these nodes is
     unspecified, and should be ignored.) For non-static member
     functions, there will be an operand corresponding to the `this'
     pointer. There will always be expressions corresponding to all of
     the arguments, even if the function is declared with default
     arguments and some arguments are not explicitly provided at the
     call sites.

`STMT_EXPR'
     These nodes are used to represent GCC's statement-expression
     extension. The statement-expression extension allows code like
     this:

          int f() { return ({ int j; j = 3; j + 7; }); }

     In other words, a sequence of statements may occur where a single
     expression would normally appear. The `STMT_EXPR' node represents
     such an expression. The `STMT_EXPR_STMT' gives the statement
     contained in the expression; this is always a `COMPOUND_STMT'. The
     value of the expression is the value of the last sub-statement in
     the `COMPOUND_STMT'. More precisely, the value is the value
     computed by the last `EXPR_STMT' in the outermost scope of the
     `COMPOUND_STMT'.
     For example, in:

          ({ 3; })

     the value is `3' while in:

          ({ if (x) { 3; } })

     (represented by a nested `COMPOUND_STMT'), there is no value. If
     the `STMT_EXPR' does not yield a value, its type will be `void'.

`BIND_EXPR'
     These nodes represent local blocks. The first operand is a list of
     temporary variables, connected via their `TREE_CHAIN' field. These
     will never require cleanups. The scope of these variables is just
     the body of the `BIND_EXPR'. The body of the `BIND_EXPR' is the
     second operand.

`LOOP_EXPR'
     These nodes represent "infinite" loops. The `LOOP_EXPR_BODY'
     represents the body of the loop. It should be executed forever,
     unless an `EXIT_EXPR' is encountered.

`EXIT_EXPR'
     These nodes represent conditional exits from the nearest enclosing
     `LOOP_EXPR'. The single operand is the condition; if it is
     nonzero, then the loop should be exited. An `EXIT_EXPR' will only
     appear within a `LOOP_EXPR'.

`CLEANUP_POINT_EXPR'
     These nodes represent full-expressions. The single operand is an
     expression to evaluate. Any destructor calls engendered by the
     creation of temporaries during the evaluation of that expression
     should be performed immediately after the expression is evaluated.

`CONSTRUCTOR'
     These nodes represent the brace-enclosed initializers for a
     structure or array. The first operand is reserved for use by the
     back end. The second operand is a `TREE_LIST'. If the `TREE_TYPE'
     of the `CONSTRUCTOR' is a `RECORD_TYPE' or `UNION_TYPE', then the
     `TREE_PURPOSE' of each node in the `TREE_LIST' will be a
     `FIELD_DECL' and the `TREE_VALUE' of each node will be the
     expression used to initialize that field. You should not depend on
     the fields appearing in any particular order, nor should you
     assume that all fields will be represented. Unrepresented fields
     may be assigned any value.

     If the `TREE_TYPE' of the `CONSTRUCTOR' is an `ARRAY_TYPE', then
     the `TREE_PURPOSE' of each element in the `TREE_LIST' will be an
     `INTEGER_CST'.
     This constant indicates which element of the array (indexed from
     zero) is being assigned to; again, the `TREE_VALUE' is the
     corresponding initializer. If the `TREE_PURPOSE' is `NULL_TREE',
     then the initializer is for the next available array element.

     Conceptually, before any initialization is done, the entire area
     of storage is initialized to zero.

`SAVE_EXPR'
     A `SAVE_EXPR' represents an expression (possibly involving
     side-effects) that is used more than once. The side-effects should
     occur only the first time the expression is evaluated. Subsequent
     uses should just reuse the computed value. The first operand to
     the `SAVE_EXPR' is the expression to evaluate. The side-effects
     should be executed where the `SAVE_EXPR' is first encountered in a
     depth-first preorder traversal of the expression tree.

`TARGET_EXPR'
     A `TARGET_EXPR' represents a temporary object. The first operand
     is a `VAR_DECL' for the temporary variable. The second operand is
     the initializer for the temporary. The initializer is evaluated,
     and copied (bitwise) into the temporary.

     Often, a `TARGET_EXPR' occurs on the right-hand side of an
     assignment, or as the second operand to a comma-expression which
     is itself the right-hand side of an assignment, etc. In this case,
     we say that the `TARGET_EXPR' is "normal"; otherwise, we say it is
     "orphaned". For a normal `TARGET_EXPR' the temporary variable
     should be treated as an alias for the left-hand side of the
     assignment, rather than as a new temporary variable.

     The third operand to the `TARGET_EXPR', if present, is a
     cleanup-expression (i.e., destructor call) for the temporary. If
     this expression is orphaned, then this expression must be executed
     when the statement containing this expression is complete. These
     cleanups must always be executed in the order opposite to that in
     which they were encountered.
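The reverse-order rule for cleanups amounts to last-in, first-out
bookkeeping. The following sketch models it with a small stack of
integer identifiers; the `toy_' names and the fixed capacity are
assumptions made for this example only and do not correspond to
anything in GCC.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CLEANUPS 16

/* Pending cleanups, stored in the order they were encountered. */
struct toy_cleanups {
  int ids[TOY_MAX_CLEANUPS];
  size_t count;
};

/* Record a cleanup at the point where its temporary is created. */
static void toy_push_cleanup(struct toy_cleanups *s, int id)
{
  assert(s->count < TOY_MAX_CLEANUPS);
  s->ids[s->count++] = id;
}

/* Run every pending cleanup, most recently recorded first.  The ids
   are written to OUT in execution order; the number run is returned. */
static size_t toy_run_cleanups(struct toy_cleanups *s, int *out)
{
  size_t n = 0;
  while (s->count > 0)
    out[n++] = s->ids[--s->count];
  return n;
}
```

Pushing cleanups 1, 2, and 3 in that order and then calling
`toy_run_cleanups' executes them as 3, 2, 1 and leaves the stack empty.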
     Note that if a temporary is created on one branch of a conditional
     operator (i.e., in the second or third operand to a `COND_EXPR'),
     the cleanup must be run only if that branch is actually executed.

     See `STMT_IS_FULL_EXPR_P' for more information about running these
     cleanups.

`AGGR_INIT_EXPR'
     An `AGGR_INIT_EXPR' represents the initialization as the return
     value of a function call, or as the result of a constructor. An
     `AGGR_INIT_EXPR' will only appear as the second operand of a
     `TARGET_EXPR'. The first operand to the `AGGR_INIT_EXPR' is the
     address of a function to call, just as in a `CALL_EXPR'. The
     second operand is the list of arguments to pass to that function,
     as a `TREE_LIST', again in a manner similar to that of a
     `CALL_EXPR'. The value of the expression is that returned by the
     function.

     If `AGGR_INIT_VIA_CTOR_P' holds of the `AGGR_INIT_EXPR', then the
     initialization is via a constructor call. The address of the third
     operand of the `AGGR_INIT_EXPR', which is always a `VAR_DECL', is
     taken, and this value replaces the first argument in the argument
     list. In this case, the value of the expression is the `VAR_DECL'
     given by the third operand to the `AGGR_INIT_EXPR'; constructors
     do not return a value.