[Bug tree-optimization/102756] [12 Regression] Complete unrolling is too sensitive to PRE; c-c++-common/torture/vector-compare-2.c

Wed Jan 19 08:22:02 GMT 2022

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102756

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Btw, the same happens on x86-64.  With -O2 and vectorization we end up with

  <bb 3> [local count: 858993457]:
  # ivtmp.14_11 = PHI <ivtmp.14_12(5), ivtmp.14_13(2)>
  _14 = (void *) ivtmp.14_11;
  _1 = MEM <int> [(vector(4) int *)_14];
  if (_1 != -3)
    goto <bb 4>; [0.00%]
  else
    goto <bb 5>; [100.00%]

  <bb 4> [count: 0]:
  __builtin_abort ();

  <bb 5> [local count: 858993457]:
  ivtmp.14_12 = ivtmp.14_11 + 4;
  if (ivtmp.14_12 != _16)
    goto <bb 3>; [80.00%]
  else
    goto <bb 6>; [20.00%]

  <bb 6> [local count: 214748368]:
  r ={v} {CLOBBER};

while everything is optimized away with -O2 -fno-tree-vectorize.
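
For readers without the testcase at hand, the loop in the dump has roughly
the shape of the sketch below. This is a hypothetical reduction written for
illustration, not the contents of vector-compare-2.c; the names v4si,
count_mismatches, check_all_minus_three and run_example are made up, and it
may or may not reproduce the exact dump above. Assuming GNU C vector
extensions, compiling it with -O2 -fdump-tree-optimized, with and without
-fno-tree-vectorize, is one way to compare what survives.

```c
#include <assert.h>
#include <stdlib.h>

/* Four-lane int vector (GNU C extension); hypothetical name.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: count the lanes of R that differ from EXPECTED.  */
int
count_mismatches (v4si r, int expected)
{
  int bad = 0;
  for (int i = 0; i < 4; i++)
    if (r[i] != expected)
      bad++;
  return bad;
}

/* Same shape as the loop in the dump: walk the lanes one int at a
   time and abort as soon as one of them is not -3.  */
void
check_all_minus_three (v4si r)
{
  for (int i = 0; i < 4; i++)
    if (r[i] != -3)
      abort ();
}

/* Hypothetical driver: every lane of a - b is -3, so a compiler that
   fully unrolls and simplifies the check can fold it away.  */
int
run_example (void)
{
  assert (sizeof (v4si) == 4 * sizeof (int));
  v4si a = { 1, 2, 3, 4 };
  v4si b = { 4, 5, 6, 7 };
  v4si r = a - b;
  check_all_minus_three (r);
  return count_mismatches (r, -3);
}
```

Since all operands are constants, the check is dead once the four
iterations are unrolled and the lane values propagated, which is the
situation the comment describes for -O2 -fno-tree-vectorize.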

Let's keep this open as a regression since -O2 now enables vectorization.  In
principle we could preserve the previous behavior for the very-cheap
vectorizer cost model or adjust the heuristic for that case to only cover
loops with a single BB.

The real issue here is, of course, that the unroller does not consider the
true size of the loop body after simplification.