[Bug tree-optimization/88398] vectorization failure for a small loop to do byte comparison

guojiufu at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Thu May 21 05:29:51 GMT 2020


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398

--- Comment #26 from Jiu Fu Guo <guojiufu at gcc dot gnu.org> ---
I tested SPEC2017 xz_r on ppc64le after manually rewriting the loop in
question.

Original loop (this loop occurs three times in the code):
  while (++len != len_limit)
    if (pb[len] != cur[len])
      break;
Rewritten loop:
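typedef long long __attribute__((may_alias)) TYPEE;

  /* Compare sizeof(TYPEE) bytes per iteration while a full word remains. */
  for (++len; len + sizeof(TYPEE) <= len_limit; len += sizeof(TYPEE)) {
    long long a = *((TYPEE*)(cur+len));
    long long b = *((TYPEE*)(pb+len));
    if (a != b) {
      break; // possible optimization: advance len straight to the mismatching byte here
    }
  }
  /* Byte loop: finds the exact mismatch and handles the remaining tail. */
  for (; len != len_limit; ++len)
    if (pb[len] != cur[len])
      break;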

With this change, xz_r runtime improved from 433s to 382s (about 12% less
runtime). It would be very valuable for the compiler to do this kind of
widened load and compare itself.
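
The comment in the rewritten loop notes that len could be moved forward
directly when the two words differ. A minimal sketch of that idea follows;
it is not the code that was benchmarked. The function names match_len_bytes
and match_len_wide are hypothetical, memcpy is used for the unaligned loads
in place of the may_alias typedef, and the GCC/Clang builtin
__builtin_ctzll is assumed to be available. The shortcut is only taken on
little-endian targets; otherwise the code falls back to the byte loop, as
in the loop above. It assumes len < len_limit on entry, like the original.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reference: the original byte-at-a-time loop, wrapped in a function. */
static size_t match_len_bytes(const uint8_t *pb, const uint8_t *cur,
                              size_t len, size_t len_limit)
{
    while (++len != len_limit)
        if (pb[len] != cur[len])
            break;
    return len;
}

/* Sketch: compare 8 bytes per iteration and, on a mismatch, jump
   straight to the first differing byte instead of rescanning. */
static size_t match_len_wide(const uint8_t *pb, const uint8_t *cur,
                             size_t len, size_t len_limit)
{
    for (++len; len + sizeof(uint64_t) <= len_limit; len += sizeof(uint64_t)) {
        uint64_t a, b;
        memcpy(&a, cur + len, sizeof a);   /* unaligned-safe load */
        memcpy(&b, pb + len, sizeof b);
        uint64_t diff = a ^ b;             /* nonzero bits mark differing bytes */
        if (diff != 0) {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
            /* Little-endian: the lowest set bit lies in the first
               differing byte, so bit index / 8 is its byte offset. */
            return len + (size_t)(__builtin_ctzll(diff) >> 3);
#else
            break;                         /* let the byte loop locate it */
#endif
        }
    }
    /* Tail: fewer than 8 bytes left, or fallback after a mismatch. */
    for (; len != len_limit; ++len)
        if (pb[len] != cur[len])
            break;
    return len;
}
```

Both functions should return the same value for the same inputs, so
match_len_bytes can serve as a check when experimenting with the wide
version.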

