[PATCH] Improved constant folding for scalar evolution.

Richard Biener richard.guenther@gmail.com
Mon Feb 21 07:26:48 GMT 2022


On Sun, Feb 20, 2022 at 2:50 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> This patch adds a small (follow-up) optimization to chrec_apply for
> linear chrecs to clean-up the final value expressions sometimes generated
> by GCC's scalar evolution pass.  The transformation of A+(X-1)*A into
> A*X is usually unsafe with respect to overflow (see PR92712), and so
> can't be performed by match.pd (or fold-const).  However, during scalar
> evolution's evaluation of recurrences it's known that X-1 can't be negative
> (in fact X-1 is unsigned even when A is signed), hence this optimization
> can be applied.  Interestingly, this expression does get simplified in
> later passes once the range of X-1 is bounded by VRP, but that occurs
> long after the decision of whether to perform final value replacement,
> which is based on the complexity of this expression.

In principle A + (X-1)*A can be always simplified to (unsigned)A * (unsigned)X,
but at least fold-consts fold_plusminus_mult has

  /* Do not resort to unsigned multiplication because
     we lose the no-overflow property of the expression.  */
  return NULL_TREE;

we might want to heuristically do that anyway if the result is not a
multiplication
by a constant (I remember doing the above because of testsuite regressions).

It might be also interesting to see whether we change back
(int)((unsigned)A * (unsigned)X)
to A * X when we can constrain ranges.

In the specific case of the testcase below we of course only know overflow
doesn't happen because it would be undefined behavior.

> The motivating test case is the optimization of the loop (from comment
> #7 of PR65855):
>
> int square(int x) {
>   int result = 0;
>   for (int i = 0; i < x; ++i)
>     result += x;
>   return result;
> }
>
> which is currently optimized, with final value replacement to:
>
>   final value replacement:
>    with expr: (int) ((unsigned int) x_3(D) + 4294967295) * x_3(D) + x_3(D)
>
> but with this patch it first gets simplified further:
>
>   final value replacement:
>    with expr: x_3(D) * x_3(D)
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures.  Ok for mainline?

OK once stage1 opens.

Thanks,
Richard.

>
> 2022-02-20  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * tree-chrec.cc (chrec_apply): Attempt to fold the linear chrec
>         "{a, +, a} (x-1)" as "a*x", as the number of loop iterations, x-1,
>         can't be negative.
>
> gcc/testsuite/ChangeLog
>         * gcc.dg/tree-ssa/pr65855-2.c: New test case.
>
>
> Roger
> --
>


More information about the Gcc-patches mailing list