[Bug c++/81410] [5/6/7/8 Regression] -O3 breaks code

rguenth at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Tue Jul 18 10:55:00 GMT 2017


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81410

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
t.ii:25:19: note: === vect_analyze_data_ref_accesses ===
t.ii:25:19: note: Detected interleaving store _10->x and _10->y
t.ii:25:19: note: Detected interleaving load MEM[(const struct Foo &)_8].x and MEM[(const struct Foo &)_8].y
t.ii:25:19: note: Detected interleaving store of size 2 starting with _10->x = _37;
t.ii:25:19: note: Detected interleaving load of size 3 starting with _37 = MEM[(const struct Foo &)_8].x;
t.ii:25:19: note: There is a gap of 1 elements after the group
...
t.ii:25:19: note: Final SLP tree for instance:
t.ii:25:19: note: node
t.ii:25:19: note:       stmt 0 _10->x = _37;
t.ii:25:19: note:       stmt 1 _10->y = _38;
t.ii:25:19: note: node
t.ii:25:19: note:       stmt 0 _37 = MEM[(const struct Foo &)_8].x;
t.ii:25:19: note:       stmt 1 _38 = MEM[(const struct Foo &)_8].y;

(Note that there is no load permutation.)

t.ii:25:19: note: Loop contains SLP and non-SLP stmts
t.ii:25:19: note: Updating vectorization factor to 4
t.ii:25:19: note: vectorization_factor = 4, niters = 5

  _37 = MEM[(const struct Foo &)_8].x;
  vect__37.14_78 = MEM[(long int *)vectp.12_80];
  vectp.12_73 = vectp.12_80 + 16;
  vect__37.15_72 = MEM[(long int *)vectp.12_73];
  vectp.12_71 = vectp.12_73 + 16;
  vect__37.16_70 = MEM[(long int *)vectp.12_71];
  vectp.12_69 = vectp.12_71 + 16;
  vect__37.17_68 = MEM[(long int *)vectp.12_69];
  vectp.12_67 = vectp.12_69 + 32;
  _38 = MEM[(const struct Foo &)_8].y;

So the gap is accounted for only once, and in the wrong place, instead of
twice as required.
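
The pointer arithmetic above can be checked by hand. The following is only a
sketch of that arithmetic, not vectorizer code. It assumes 8-byte elements and
16-byte vectors (two elements each, which matches the +16 bumps in the dump),
and it assumes each of the four vector loads is meant to cover one scalar
iteration, as the SLP tree without load permutation suggests. The helper names
are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum {
  ELEM_SIZE  = 8,   /* assumption: size of one element (long int / uint64_t) */
  GROUP_SIZE = 3,   /* load group size from the dump: x, y, one-element gap */
  VEC_BYTES  = 16,  /* assumption: one vector holds two elements */
  VF         = 4    /* vectorization factor reported in the dump */
};

/* Byte offset of the first element read by scalar iteration I,
   i.e. the offset of x[3*i] in the C testcase.  */
static size_t
scalar_load_offset (size_t i)
{
  return i * GROUP_SIZE * ELEM_SIZE;
}

/* Byte offset of the K-th vector load in the dump, relative to
   vectp.12_80: the pointer is bumped by 16 between the loads.  */
static size_t
dumped_load_offset (size_t k)
{
  return k * VEC_BYTES;
}

/* Sum of the four pointer bumps shown in the dump.  */
static size_t
dumped_total_advance (void)
{
  return 16 + 16 + 16 + 32;
}

/* Bytes of the source array covered by VF scalar iterations.  */
static size_t
scalar_total_advance (void)
{
  return VF * GROUP_SIZE * ELEM_SIZE;
}
```

Under these assumptions the scalar iterations start at byte offsets 0, 24, 48
and 72, while the dumped vector loads start at 0, 16, 32 and 48. The bumps in
the dump add up to 80 bytes, whereas four scalar iterations cover 96 bytes.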

C testcase:

typedef __UINT64_TYPE__ uint64_t;
uint64_t x[24];
uint64_t y[16];
uint64_t z[8];

void __attribute__((noinline)) foo()
{
  for (int i = 0; i < 8; ++i)
    {
      y[2*i] = x[3*i];
      y[2*i + 1] = x[3*i + 1];
      z[i] = 1;
    }
}

int main()
{
  for (int i = 0; i < 24; ++i)
    x[i] = i;
  foo ();
  for (int i = 0; i < 8; ++i)
    if (y[2*i] != 3*i || y[2*i+1] != 3*i + 1)
      __builtin_abort ();
  return 0;
}

