[PATCH] Final value replacement improvements for until-wrap loops.

Roger Sayle roger@nextmovesoftware.com
Mon Nov 29 09:04:41 GMT 2021


This middle-end patch is inspired by Richard Biener's until-wrap
loop example in PR tree-optimization/101145.

unsigned foo(unsigned val, unsigned start)
{
  unsigned cnt = 0;
  for (unsigned i = start; i > val; ++i)
    cnt++;
  return cnt;
}

For this loop, the tree optimizers currently generate:

unsigned int foo (unsigned int val, unsigned int start)
{
  unsigned int cnt;
  unsigned int _1;
  unsigned int _5;

  <bb 2> [local count: 118111600]:
  if (start_3(D) > val_4(D))
    goto <bb 3>; [89.00%]
  else
    goto <bb 4>; [11.00%]

  <bb 3> [local count: 105119324]:
  _1 = start_3(D) + 1;
  _5 = -start_3(D);
  cnt_2 = _1 > val_4(D) ? _5 : 1;

  <bb 4> [local count: 118111600]:
  # cnt_11 = PHI <cnt_2(3), 0(2)>
  return cnt_11;
}

or perhaps slightly easier to read:

  if (start > val) {
    cnt = (start+1) > val ? -start : 1;
  } else cnt = 0;

In this snippet, if we know start > val, then (start+1) > val
unless start+1 overflows, i.e. (start+1) == 0 and start == ~0.
We can use this (loop header) context to simplify the ternary
expression to "(start != -1) ? -start : 1", which with a little
help from match.pd can be folded to -start.  Hence the optimal
final value replacement should be:

  cnt = (start > val) ? -start : 0;

Or as now generated by this patch:

unsigned int foo (unsigned int val, unsigned int start)
{
  unsigned int cnt;

  <bb 2> [local count: 118111600]:
  if (start_3(D) > val_4(D))
    goto <bb 3>; [89.00%]
  else
    goto <bb 4>; [11.00%]

  <bb 3> [local count: 105119324]:
  cnt_2 = -start_3(D);

  <bb 4> [local count: 118111600]:
  # cnt_11 = PHI <cnt_2(3), 0(2)>
  return cnt_11;
}


We can also improve until-wrap loops that don't have a (suitable) loop
header, as determined by simplify_using_initial_conditions.

unsigned bar(unsigned val, unsigned start)
{
  unsigned cnt = 0;
  unsigned i = start;
  do {
    cnt++;
    i++;
  } while (i > val);
  return cnt;
}

which is currently optimized to:

unsigned int foo (unsigned int val, unsigned int start)
{
  unsigned int cnt;
  unsigned int _9;
  unsigned int _10;

  <bb 2> [local count: 118111600]:
  _9 = start_4(D) + 1;
  _10 = -start_4(D);
  cnt_3 = val_7(D) < _9 ? _10 : 1;
  return cnt_3;
}

Here we have "val < (start+1) ? -start : 1", which again with the
help of match.pd can be slightly simplified to "val <= start ? -start : 1"
when dealing with unsigned types, because at the complicating value where
start == ~0, we fortunately have -start == 1, hence it doesn't matter
whether the second or third operand of the ternary operator is returned.

To summarize, this patch (in addition to tweaking may_be_zero in
number_of_iterations_until_wrap) adds three new constant folding
transforms to match.pd.

X != C1 ? -X : C2 simplifies to -X when -C1 == C2.
which is the generalized form of the simplification above.

X != C1 ? ~X : C2 simplifies to ~X when ~C1 == C2.
which is the BIT_NOT_EXPR analog of the NEGATE_EXPR case.

and the "until-wrap final value replacement without context":

(X + 1) > Y ? -X : 1 simplifies to X >= Y ? -X : 1 when
X is unsigned, as when X + 1 overflows, X is -1, so -X == 1.


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures.  Ok for mainline?


2021-11-29  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* tree-ssa-loop-niter.c (number_of_iterations_until_wrap):
	Check if simplify_using_initial_conditions allows us to
	simplify the expression for may_be_zero.
	* match.pd (X != C ? -X : -C -> -X): New transform.
	(X != C ? ~X : ~C -> ~X): Likewise.
	((X+1) > Y ? -X : 1 -> X >= Y ? -X : 1): Likewise.

gcc/testsuite/ChangeLog
	* gcc.dg/fold-condneg-1.c: New test case.
	* gcc.dg/fold-condneg-2.c: New test case.
	* gcc.dg/fold-condnot-1.c: New test case.
	* gcc.dg/pr101145-1.c: New test case.
	* gcc.dg/pr101145-2.c: New test case.


Thanks in advance,
Roger
--

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: patchn3b.txt
URL: <https://gcc.gnu.org/pipermail/gcc-patches/attachments/20211129/cf9a20a9/attachment.txt>


More information about the Gcc-patches mailing list