[Bug tree-optimization/88398] vectorization failure for a small loop to do byte comparison
guojiufu at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Thu May 21 05:29:51 GMT 2020
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398
--- Comment #26 from Jiu Fu Guo <guojiufu at gcc dot gnu.org> ---
I ran a test on SPEC2017 xz_r on ppc64le after manually rewriting the loop in
question.
Original loop (this loop occurs three times in the code):
while (++len != len_limit)
    if (pb[len] != cur[len])
        break;
Changed loop:
typedef long long __attribute__((may_alias)) TYPEE;
for (++len; len + sizeof(TYPEE) <= len_limit; len += sizeof(TYPEE)) {
    long long a = *((TYPEE *)(cur + len));
    long long b = *((TYPEE *)(pb + len));
    if (a != b) {
        break; // to optimize: len can be moved forward here.
    }
}
for (; len != len_limit; ++len)
    if (pb[len] != cur[len])
        break;
With this change, xz_r runtime improved from 433s to 382s (about 12% less
time, roughly a 1.13x speedup).
It would be very valuable for GCC to do this kind of widened reading/checking
itself.
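For anyone who wants to try the transformation outside of xz, here is a minimal self-contained sketch of both loops as functions. The names match_len_bytewise and match_len_wide are made up for illustration and are not from xz or GCC. The loads use memcpy instead of the may_alias typedef, which is a substitution on my side and should compile to the same 8-byte load. The ctz step is one possible reading of the "len can be moved forward" note; it assumes a little-endian target and a GCC-compatible compiler, and falls back to the plain break otherwise. Both functions assume len < len_limit on entry, as the original loop does.

```c
#include <stdint.h>
#include <string.h>

/* Byte-wise reference: same shape as the original loop. */
static uint32_t match_len_bytewise(const uint8_t *pb, const uint8_t *cur,
                                   uint32_t len, uint32_t len_limit)
{
    while (++len != len_limit)
        if (pb[len] != cur[len])
            break;
    return len;
}

/* Widened variant: compare 8 bytes per iteration, then finish byte-wise. */
static uint32_t match_len_wide(const uint8_t *pb, const uint8_t *cur,
                               uint32_t len, uint32_t len_limit)
{
    for (++len; len + sizeof(uint64_t) <= len_limit; len += sizeof(uint64_t)) {
        uint64_t a, b;
        /* memcpy avoids aliasing and alignment problems. */
        memcpy(&a, cur + len, sizeof a);
        memcpy(&b, pb + len, sizeof b);
        if (a != b) {
#if defined(__GNUC__) && defined(__BYTE_ORDER__) && \
    __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
            /* Assumption: little-endian, so the lowest set bit of a ^ b
               lies in the first differing byte. Skip straight to it. */
            return len + (uint32_t)(__builtin_ctzll(a ^ b) >> 3);
#else
            break; /* the byte-wise tail below locates the mismatch */
#endif
        }
    }
    /* Tail: fewer than 8 bytes left, or a mismatch somewhere in the word. */
    for (; len != len_limit; ++len)
        if (pb[len] != cur[len])
            break;
    return len;
}
```

Both functions should return the same value for any inputs that satisfy the precondition, so the byte-wise one can serve as a reference when checking the widened one.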