[Bug tree-optimization/69908] recognizing idioms that check for a buffer of all-zeros could make *much* better code

glisse at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Mon Jul 10 09:48:00 GMT 2017


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69908

--- Comment #8 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to Yuri Gribov from comment #7)
> Hm, I've just tried r249806 both with -ftree-loop-distribution and
> -fno-tree-loop-distribution on top of flags above without any changes in
> output. This may depend on revision/flags/machine, which ones did you use?

On x86_64

$ cat a.c
void memcpy_(char * __restrict a, char * __restrict b, unsigned n) {
  unsigned i;
  for (i = 0; i < n; ++i)
    a[i] = b[i];
}
$ gcc-7 a.c -O3 -S -fdump-tree-optimized && cat a.c.227t.optimized
[...]
  <bb 2> [15.00%]:
  if (n_8(D) != 0)
    goto <bb 3>; [85.00%]
  else
    goto <bb 4>; [15.00%]

  <bb 3> [12.75%]:
  _17 = n_8(D) + 4294967295;
  _21 = (sizetype) _17;
  _20 = _21 + 1;
  __builtin_memcpy (a_10(D), b_9(D), _20); [tail call]

  <bb 4> [15.00%]:
  return;
[...]
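
For readers following the dump, the length computation in <bb 3> can be
sketched in C. This is only an illustration: memcpy_len is a hypothetical
helper name, and it assumes unsigned is 32 bits and sizetype is 64 bits, as
on x86_64.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the three statements in <bb 3>.
   Assumes 32-bit unsigned and 64-bit sizetype (x86_64). */
static size_t memcpy_len(uint32_t n)
{
    uint32_t t17 = n + 4294967295u; /* _17 = n - 1, modulo 2^32 */
    size_t   t21 = (size_t) t17;    /* _21 = zero-extension to sizetype */
    size_t   t20 = t21 + 1;         /* _20 = byte count passed to memcpy */
    return t20;
}
```

For any n != 0 this returns n. For n == 0 the 32-bit subtraction wraps and,
with a 64-bit sizetype, the result would be 2^32, so the n != 0 test in
<bb 2> is what keeps the call correct.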
$ gcc-7 a.c -O3 -S -fdump-tree-optimized -fdisable-tree-ldist && cat a.c.227t.optimized
[...]
  <bb 22> [68.85%]:
  # ivtmp.21_193 = PHI <ivtmp.21_194(22), 0(21)>
  # ivtmp.24_195 = PHI <ivtmp.24_196(22), 0(21)>
  vect__4.13_55 = MEM[base: vectp_b.12_52, index: ivtmp.24_195, offset: 0B];
  MEM[base: vectp_a.15_56, index: ivtmp.24_195, offset: 0B] = vect__4.13_55;
  ivtmp.21_194 = ivtmp.21_193 + 1;
  ivtmp.24_196 = ivtmp.24_195 + 16;
  if (bnd.8_48 > ivtmp.21_194)
    goto <bb 22>; [83.34%]
  else
    goto <bb 23>; [16.66%]
[...]

(At first glance, it is surprising that the loop keeps two induction
variables, one stepping by +1 and the other by +16.)
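
To make the two counters concrete, here is a rough C rendering of <bb 22>.
It is a sketch, not compiler output: the function names are made up, a
16-byte memcpy stands in for the vector load and store, and nvec (playing
the role of bnd.8_48) is assumed to be at least 1. The second function shows
one possible single-counter form, which is my reading of why the pair looks
redundant.

```c
#include <stddef.h>
#include <string.h>

/* Two counters, as in the dump: iter is ivtmp.21, off is ivtmp.24. */
static void copy_chunks(char *restrict a, const char *restrict b, size_t nvec)
{
    size_t iter = 0; /* iteration count, +1 per trip */
    size_t off = 0;  /* byte offset, +16 per trip */
    do {
        memcpy(a + off, b + off, 16); /* stands in for one vector copy */
        iter += 1;
        off += 16;
    } while (nvec > iter);
}

/* Hypothetical alternative: the byte offset alone controls the exit. */
static void copy_chunks_one_iv(char *restrict a, const char *restrict b,
                               size_t nvec)
{
    size_t end = nvec * 16; /* computed once, outside the loop */
    size_t off = 0;
    do {
        memcpy(a + off, b + off, 16);
        off += 16;
    } while (off != end);
}
```

Both functions copy the same nvec * 16 bytes. The second only moves the
exit test onto the offset, at the cost of computing end before the loop.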

