[Bug tree-optimization/88398] vectorization failure for a small loop to do byte comparison

jakub at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Tue Jan 8 14:48:00 GMT 2019


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398

--- Comment #18 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Duff's device doesn't need to be done with a computed jump; it can be done
with 3 conditional branches + 3 comparisons too.  The advantage of doing it
that way, especially if iter isn't really very small, is that you don't need
those 4 unrolled iterations + one scalar loop: the remainder is handled by
jumping into the middle of the unrolled body.
if (n & 2)
  {
    if (n & 1) { n++; goto loop3; }
    { n += 2; goto loop2; }
  }
else if (n & 1)
  { n += 3; goto loop1; }
else if (n == 0)
  goto end;
do
  {
    iter;
  loop3:
    iter;
  loop2:
    iter;
  loop1:
    iter;
    n -= 4;
  }
while (n != 0);
end:;

Of course, if iter is very short, it might be easier or more efficient to
duplicate iter more than 4 times and do something else.
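
For anyone who wants to convince themselves that the entry logic above runs
iter exactly n times, here is a minimal self-checking sketch.  It wraps the
same control flow in a function; the names run_iters, count_iters,
check_counts and the counting iter() are hypothetical helpers made up for
illustration, and n is assumed to be an unsigned count.

```c
#include <assert.h>

/* Hypothetical stand-in for "iter": just counts how often it ran. */
static unsigned long iter_calls;
static void iter(void) { iter_calls++; }

/* Same control flow as the sketch above: n is rounded up to a multiple
   of 4 and we jump past the iterations that should not run.  */
static void run_iters(unsigned long n)
{
  if (n & 2)
    {
      if (n & 1) { n++; goto loop3; }
      n += 2; goto loop2;
    }
  else if (n & 1)
    { n += 3; goto loop1; }
  else if (n == 0)
    goto end;
  do
    {
      iter();
    loop3:
      iter();
    loop2:
      iter();
    loop1:
      iter();
      n -= 4;
    }
  while (n != 0);
end:;
}

/* Returns how many times iter() was called for a given n. */
static unsigned long count_iters(unsigned long n)
{
  iter_calls = 0;
  run_iters(n);
  return iter_calls;
}

/* Asserts that every n below limit yields exactly n calls. */
static void check_counts(unsigned long limit)
{
  for (unsigned long n = 0; n < limit; n++)
    assert(count_iters(n) == n);
}
```

Calling check_counts(64) from a test driver exercises all four entry points
(n % 4 == 0, 1, 2, 3) as well as the n == 0 early exit.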

