[Bug tree-optimization/69908] recognizing idioms that check for a buffer of all-zeros could make *much* better code
glisse at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Mon Jul 10 09:48:00 GMT 2017
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69908
--- Comment #8 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to Yuri Gribov from comment #7)
> Hm, I've just tried r249806 both with -ftree-loop-distribution and
> -fno-tree-loop-distribution on top of flags above without any changes in
> output. This may depend on revision/flags/machine, which ones did you use?
On x86_64:
$ cat a.c
void memcpy_(char * __restrict a, char * __restrict b, unsigned n) {
  unsigned i;
  for (i = 0; i < n; ++i)
    a[i] = b[i];
}
$ gcc-7 a.c -O3 -S -fdump-tree-optimized && cat a.c.227t.optimized
[...]
  <bb 2> [15.00%]:
  if (n_8(D) != 0)
    goto <bb 3>; [85.00%]
  else
    goto <bb 4>; [15.00%]

  <bb 3> [12.75%]:
  _17 = n_8(D) + 4294967295;
  _21 = (sizetype) _17;
  _20 = _21 + 1;
  __builtin_memcpy (a_10(D), b_9(D), _20); [tail call]

  <bb 4> [15.00%]:
  return;
[...]
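
The length passed to __builtin_memcpy above is built as (sizetype)(n - 1) + 1, with the subtraction done in 32-bit unsigned arithmetic, rather than as (sizetype) n. A minimal sketch of that arithmetic, assuming a 32-bit unsigned as on x86_64; the helper name memcpy_len and the t17/t21 locals are hypothetical and only mirror _17, _21 and _20 from the dump:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring _17, _21 and _20 from the dump.
   Assumes unsigned is 32 bits wide, as on x86_64. */
static size_t memcpy_len(unsigned n)
{
    assert(n != 0);                  /* <bb 2> only reaches the call when n != 0 */
    unsigned t17 = n + 4294967295u;  /* n - 1, computed modulo 2^32 */
    size_t t21 = (size_t) t17;       /* zero-extended to sizetype */
    return t21 + 1;                  /* byte count handed to memcpy */
}
```

For every nonzero n this returns n, so the result matches the trip count of the original loop. Without the n != 0 guard, n - 1 would wrap to 4294967295 and, with a 64-bit sizetype, the length would come out as 4294967296 instead of 0.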
$ gcc-7 a.c -O3 -S -fdump-tree-optimized -fdisable-tree-ldist && cat a.c.227t.optimized
[...]
  <bb 22> [68.85%]:
  # ivtmp.21_193 = PHI <ivtmp.21_194(22), 0(21)>
  # ivtmp.24_195 = PHI <ivtmp.24_196(22), 0(21)>
  vect__4.13_55 = MEM[base: vectp_b.12_52, index: ivtmp.24_195, offset: 0B];
  MEM[base: vectp_a.15_56, index: ivtmp.24_195, offset: 0B] = vect__4.13_55;
  ivtmp.21_194 = ivtmp.21_193 + 1;
  ivtmp.24_196 = ivtmp.24_195 + 16;
  if (bnd.8_48 > ivtmp.21_194)
    goto <bb 22>; [83.34%]
  else
    goto <bb 23>; [16.66%]
[...]
(at first glance, the +1 vs +16 is surprising)
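
One way to read <bb 22>, offered as a rough sketch rather than what the vectorizer literally emits: ivtmp.21 counts vector iterations and is the value compared against bnd.8, while ivtmp.24 is the byte offset used to index both buffers. The function name vec_body, the size_t types and the use of a 16-byte memcpy to stand in for the vector load/store are all assumptions made for illustration:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical C rendering of <bb 22>; names and types are illustrative.
   Like the dump, the body runs once before the bound is tested, so the
   caller must pass bnd >= 1. */
static void vec_body(char *a, const char *b, size_t bnd)
{
    size_t iter = 0;  /* plays the role of ivtmp.21: +1 per iteration */
    size_t off = 0;   /* plays the role of ivtmp.24: +16 per iteration */
    do {
        memcpy(a + off, b + off, 16);  /* stands in for the 16-byte vector move */
        iter += 1;
        off += 16;
    } while (bnd > iter);
}
```

Written this way, the two counters always satisfy off == 16 * iter, so the loop carries two induction variables where a single one (for instance, testing off against 16 * bnd) would in principle suffice.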