This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [tree-ssa] fold-const.c merge from mainline
- From: law at redhat dot com
- To: Roger Sayle <roger at eyesopen dot com>
- Cc: Richard Henderson <rth at redhat dot com>, gcc-patches at gcc dot gnu dot org
- Date: Mon, 13 Oct 2003 19:44:18 -0600
- Subject: Re: [tree-ssa] fold-const.c merge from mainline
- Reply-to: law at redhat dot com
In message <Pine.LNX.4.44.0310131649420.25021-100000@www.eyesopen.com>, Roger S
ayle writes:
>Then all that remains is the "Performance" argument, to which there
>are three strong counter arguments.
>
>(1) Whilst it is true that CCP only cares about cases when we can fold
>down to a constant, it is frequently the case that fold reduces an
>tree expression to a constant but in a number of steps. For example,
>"(x + 3) - x" can be reduced by "fold" via the recursive call to
>"(x - x) + 3". Unfortunately, nondestructive_fold_binary_to_constant
>on the tree-ssa branch isn't able to do this.
But you don't get those kinds of expressions during CCP. They simply
don't happen.
>(2) I strongly doubt that performance is an issue. Although, "fold"
>as a function is large, it is principally a single switch statement
>meaning that the only code executed are the transformations applicable
>to the current TREE_CODE. I'll agree there's plenty that could be
>done to manually factor tests [Indeed my original motivation for GCC's
>jump bypassing pass was to optimize control flow through fold].
My recollection was that it was significant and measurable.
>However, as with all good discussions on performance, the ultimate
>arbiter is experimental measurement. At the time you decided to split
>fold, its destructive nature mean't that it was impossible to assess
>overhead. However, it should now be possible to time a tree-ssa
>variant that implements nondestructive_fold_* as wrappers around fold.
Err, yes it was possible to measure the overhead. What we used to do was
copy the expression, fold it, then do the replacement when all the lattice
values were stable.
It was certainly the case that the expression copying was *very* expensive,
both in direct terms and indirect terms (ie, impact on the garbage collector
and data locality). However, my recollection was that the folder itself
was also showing up in significant ways in the profiles.
>(3) But the most significant problem I see with the approach taken on
>the tree-ssa branch is the duplication of tree-level constant folding
>logic. For example, plans of mine to add constant folding support
>for "local" range propagation, "((unsigned int)x | 8) > 2" will either
>be useless to SSA optimizers (such as ssa-vrp!) or duplicated.
Again, you won't see this kind of expression in CCP because it's not a
gimple expression.
Jeff