[PATCH] Fix canonicalization of addresses
Richard Guenther
richard.guenther@gmail.com
Mon Dec 29 23:07:00 GMT 2008
On Mon, Dec 29, 2008 at 10:02 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> So after the patch we would canonicalize the above to base + (ind - 1) * 2?
>
> Yes, that's still suboptimal for most targets but simply disabling the
> factorization doesn't seem to be doable at this point.
>
>> To even catch the case where the user writes (1-ind)*-2 can you instead
>> extend fold_binary at
>>
>>     case MULT_EXPR:
>>       /* (-A) * (-B) -> A * B */
>>       if (TREE_CODE (arg0) == NEGATE_EXPR && negate_expr_p (arg1))
>>         return fold_build2 (MULT_EXPR, type,
>>                             fold_convert (type, TREE_OPERAND (arg0, 0)),
>>                             fold_convert (type, negate_expr (arg1)));
>>       if (TREE_CODE (arg1) == NEGATE_EXPR && negate_expr_p (arg0))
>>         return fold_build2 (MULT_EXPR, type,
>>                             fold_convert (type, negate_expr (arg0)),
>>                             fold_convert (type, TREE_OPERAND (arg1, 0)));
>>
>> to also handle negative constants instead of just NEGATE_EXPR?
>>
>> IMHO this would be the better approach.
>
> What would be better than what exactly?
In the above case, transforming -A * -CST to A * CST (which would also handle
the (1 - ind) * -2 case if written that way by a user, not only if generated
by fold_plusminus_mult). Thus, it subsumes the plusminus_mult patch
in favor of an IMHO better one.
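To make the arithmetic concrete, here is a minimal standalone C sketch of the
identity the extended fold would rely on. It is not GCC code: the function
names are made up for illustration, and it only checks that the two source
forms compute the same value over a small range where nothing overflows.

```c
#include <assert.h>

/* Form as a user (or fold_plusminus_mult) might write it:
   a negatable operand times a negative constant.  */
long scaled_negated (long ind)
{
  return (1 - ind) * -2;
}

/* Form after rewriting -A * -CST as A * CST, with A = ind - 1
   and CST = 2.  */
long scaled_folded (long ind)
{
  return (ind - 1) * 2;
}

/* Hypothetical self-check: returns 1 if both forms agree for every
   ind in [lo, hi].  Callers should keep the range small enough
   that the multiplication cannot overflow.  */
int forms_agree (long lo, long hi)
{
  for (long ind = lo; ind <= hi; ind++)
    if (scaled_negated (ind) != scaled_folded (ind))
      return 0;
  return 1;
}

/* Example use of the check on a small, overflow-free range.  */
void check_small_range (void)
{
  assert (forms_agree (-1000, 1000));
}
```

Both functions are the same computation; the question in this thread is only
which spelling fold should pick as the canonical one.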
>> I consider the expr.c hunk a hack - what fixes this up at -O2 vs. -Os?
>
> We have a testcase for which it hugely helps at -O2 because you can CSE a
> bunch of (base + ind*2) calculations, leaving only the displacements as
> adjustments. When the displacements are inside the *2, CSE is not as
> effective and addresses are needlessly recomputed.
Well, sure. But then you can construct a testcase where you can CSE
the (ind + displacement)*2 for different bases. It's only a canonicalization,
it isn't supposed to be the optimization ;)
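A small C sketch of the two shapes, to show which subexpression each one
exposes for sharing. This is only an illustration with made-up function
names, using size_t so the arithmetic wraps the way address computations do;
it says nothing about what GCC's CSE actually does with either form.

```c
#include <assert.h>
#include <stddef.h>

/* Displacement outside the scaling: for one base and many
   displacements, base + ind * 2 is the shared part.  */
size_t addr_disp_outside (size_t base, size_t ind, size_t disp)
{
  size_t common = base + ind * 2;   /* reusable across displacements */
  return common + disp * 2;
}

/* Displacement inside the scaling: for one displacement and many
   bases, (ind + disp) * 2 is the shared part.  */
size_t addr_disp_inside (size_t base, size_t ind, size_t disp)
{
  size_t common = (ind + disp) * 2; /* reusable across bases */
  return base + common;
}

/* Both shapes compute the same address; only the grouping differs.  */
void check_same_address (void)
{
  assert (addr_disp_outside (1000, 7, 3) == addr_disp_inside (1000, 7, 3));
}
```

Which grouping wins depends on whether the testcase varies the displacement
or the base, which is why neither form is "the" optimal one by itself.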
I think a proper place to change canonicalization would be during
induction variable optimization - why does that not happen? It should
at least in theory decompose the addresses to affine combinations and
see this opportunity.
>> Can we enable that at -Os as well instead?
>
> IMO the transformations should be done at any optimization level, they just
> get us back to where we were before.
I was asking: at -O2 (where we do the same folds), what "fixes" it without
your patch? In other words, why is the regression only there for -Os?
Thanks,
Richard.