This is the mail archive of the
mailing list for the GCC project.
Re: [LTO] patch: new CALL_EXPR abstractions in builtins.c
- From: Roger Sayle <roger at eyesopen dot com>
- To: Sandra Loosemore <sandra at codesourcery dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Tue, 27 Jun 2006 11:00:06 -0600 (MDT)
- Subject: Re: [LTO] patch: new CALL_EXPR abstractions in builtins.c
On Tue, 27 Jun 2006, Sandra Loosemore wrote:
> This is the first in a series of patches which will eventually result in
> changing the representation of CALL_EXPRs so that arguments are stored
> directly as operands instead of in a TREE_LIST. (If I run into some
> intractable problem with having a tree_exp node with a variably-sized
> operand array, my backup plan is to store the first few operands in the
> node and use a TREE_VEC for the overflow; I ran some experiments on
> some large C programs that indicated that calls with 2 arguments or
> less account for around 70% of all CALL_EXPRs, so it makes sense to
> optimize at least the most common cases.)
It's disappointing the way that we now have to speculatively construct
trees (representing the CALL_EXPRs) to test whether a built-in function
can be folded. One of the recent pushes with tree-level optimizers has
been to move towards APIs that allow us to fold/simplify them without
having to explicitly construct them first, and thereby leak memory to
the garbage collector.
I suspect a more aggressive restructuring of this code, introducing new
APIs such as fold_unary_builtin, fold_binary_builtin, etc., where we
explicitly pass arg0, arg1 ... argN in the common cases (most builtins
don't have a variable number of arguments, and often only one or two
operands/arguments) would address the majority of the memory regression
incurred by this change.
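A minimal sketch of what such an API could look like, written against a toy node type rather than GCC's real `tree`. The names fold_unary_builtin and fold_binary_builtin come from the paragraph above; everything else (struct toy_node, the TOY_* enumerators, the return convention) is an assumption made for illustration and is not taken from builtins.c.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a GCC tree node; NOT the real `tree` type. */
enum toy_code { TOY_INT_CST, TOY_STRING_CST, TOY_VAR };

struct toy_node {
    enum toy_code code;
    long ival;          /* meaningful only for TOY_INT_CST */
    const char *sval;   /* meaningful only for TOY_STRING_CST */
};

/* Hypothetical builtin identifiers, loosely modeled on BUILT_IN_ABS etc. */
enum toy_builtin { TOY_BUILT_IN_ABS, TOY_BUILT_IN_STRLEN, TOY_BUILT_IN_STRCMP };

/* Try to fold a one-argument builtin from its argument alone.
   Returns 1 and stores the constant in *result on success, 0 otherwise.
   No call node is ever built, so nothing is left for a collector.  */
int fold_unary_builtin(enum toy_builtin fn, const struct toy_node *arg0,
                       long *result)
{
    assert(arg0 != NULL && result != NULL);
    switch (fn) {
    case TOY_BUILT_IN_ABS:
        /* Skip LONG_MIN, whose negation would overflow. */
        if (arg0->code == TOY_INT_CST && arg0->ival != LONG_MIN) {
            *result = arg0->ival < 0 ? -arg0->ival : arg0->ival;
            return 1;
        }
        return 0;
    case TOY_BUILT_IN_STRLEN:
        if (arg0->code == TOY_STRING_CST) {
            *result = (long) strlen(arg0->sval);
            return 1;
        }
        return 0;
    default:
        return 0;
    }
}

/* Two-argument counterpart: arg0 and arg1 are passed explicitly. */
int fold_binary_builtin(enum toy_builtin fn, const struct toy_node *arg0,
                        const struct toy_node *arg1, long *result)
{
    assert(arg0 != NULL && arg1 != NULL && result != NULL);
    if (fn == TOY_BUILT_IN_STRCMP
        && arg0->code == TOY_STRING_CST && arg1->code == TOY_STRING_CST) {
        int c = strcmp(arg0->sval, arg1->sval);
        *result = (c > 0) - (c < 0);   /* normalize to -1, 0 or 1 */
        return 1;
    }
    return 0;
}
```

A caller would only build a full call node when these return 0, or when the builtin takes a variable number of arguments.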
If we're going to treat CALL_EXPRs like other "operation" tree nodes
with explicit operands, then it makes sense to process them the same
way that we need to handle other operator tree codes. For example,
using fold_binary we no longer have to build an explicit PLUS_EXPR
tree to see if it can be simplified.
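To make the difference concrete, here is a small self-contained model of the two styles. It is only a sketch: mini_tree, mini_build, mini_build_then_fold and mini_fold_binary are hypothetical names, the allocation counter stands in for garbage-collector pressure, and none of this is GCC's actual fold_binary implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy expression node; NOT GCC's tree representation. */
enum mini_code { MINI_INT_CST, MINI_VAR, MINI_PLUS_EXPR };

struct mini_tree {
    enum mini_code code;
    long value;                     /* meaningful only for MINI_INT_CST */
    struct mini_tree *op0, *op1;    /* operands of MINI_PLUS_EXPR */
};

/* Counts every node ever allocated (a proxy for GC garbage). */
static unsigned long mini_nodes_allocated;

struct mini_tree *mini_build(enum mini_code code, long value,
                             struct mini_tree *op0, struct mini_tree *op1)
{
    struct mini_tree *t = malloc(sizeof *t);
    if (t == NULL)
        abort();
    t->code = code;
    t->value = value;
    t->op0 = op0;
    t->op1 = op1;
    mini_nodes_allocated++;
    return t;
}

/* Old style: build the PLUS_EXPR first, then see whether it simplifies.
   The PLUS node is allocated even when it is thrown away immediately.
   Nodes are deliberately never freed, mimicking collector-managed memory. */
struct mini_tree *mini_build_then_fold(struct mini_tree *op0,
                                       struct mini_tree *op1)
{
    struct mini_tree *t = mini_build(MINI_PLUS_EXPR, 0, op0, op1);
    if (op0->code == MINI_INT_CST && op1->code == MINI_INT_CST)
        return mini_build(MINI_INT_CST, op0->value + op1->value, NULL, NULL);
    return t;
}

/* fold_binary style: inspect the operands directly. Returns NULL when
   nothing simplifies, so a failed attempt allocates nothing.
   Signed overflow is ignored in this toy. */
struct mini_tree *mini_fold_binary(enum mini_code code,
                                   struct mini_tree *op0,
                                   struct mini_tree *op1)
{
    if (code != MINI_PLUS_EXPR)
        return NULL;
    if (op0->code == MINI_INT_CST && op1->code == MINI_INT_CST)
        return mini_build(MINI_INT_CST, op0->value + op1->value, NULL, NULL);
    if (op1->code == MINI_INT_CST && op1->value == 0)
        return op0;                 /* x + 0 simplifies to x */
    return NULL;
}
```

Attempting to fold `x + y` for two variables ten times costs ten discarded nodes with mini_build_then_fold and none with mini_fold_binary.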
We may only allocate an actual CALL_EXPR once, but may speculatively
attempt to fold it many tens of times in the tree-ssa optimizers. It
would be a shame that in order to reduce the footprint of the CALL_EXPR,
we have to repeatedly leak memory to the garbage collector during DOM,
PRE, CCP, out-of-ssa, etc... I'm not sure the LTO branch is far enough
along to measure the tradeoff of max allocated memory vs. max "live"
memory.
I don't think that it's too much overhead to need to construct a full
CALL_EXPR to handle variable argument lists (such as sprintf, et al.)
or to handle builtins with four or more arguments, but it would be
nice if we could at least reduce the overhead for memcmp, strcmp,
memset, abs, etc... that are not only frequent in real codes, but
also synthesized by the compiler.
I hope this helps. Let me know what you think.