This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug tree-optimization/70138] [6 Regression] wrong code at -O3 on x86_64-linux-gnu


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70138

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Further improved testcase: the number of iterations is decreased somewhat, and
the u elements that are summed are now different in each outer loop iteration,
to verify that the vectorizer doesn't just multiply the value from a single
iteration by the number of iterations:

double u[333 * 333];

__attribute__((noinline, noclone)) static void
foo (int *x)
{
  double c = 0.0;
  int a, b;
  for (a = 0; a < 333; a++)
    {
      for (b = 0; b < 333; b++)
        c = c + u[334 * a];
      u[334 * a] *= 2.0;
    }
  *x = c;
}

int
main ()
{
  int d, e;
  for (d = 0; d < 333 * 333; d++)
    u[d] = 499.0;
  for (d = 0; d < 333; d++)
    u[d * 334] = (d + 2);
  foo (&e);
  if (e != 333 * (2 + 334) / 2 * 333)
    __builtin_abort ();
  return 0;
}
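
The constant that main () compares against can be cross-checked with integer
arithmetic alone. The sketch below is not part of the testcase; expected_sum
is a hypothetical helper name. It relies on what the source shows: u[334 * a]
is initialized to a + 2, the inner loop adds it 333 times, and the doubling
store happens only after that element's last read, so it does not affect the
sum.

```c
/* Hypothetical helper, not part of the testcase: recompute the value
   that main () expects, using integer arithmetic only.  */
long
expected_sum (void)
{
  long total = 0;
  int a, b;
  for (a = 0; a < 333; a++)
    for (b = 0; b < 333; b++)
      total += a + 2;	/* u[334 * a] was initialized to a + 2.  */
  return total;
}
```

The inner sum over a is 2 + 3 + ... + 334 = 333 * (2 + 334) / 2 = 55944, and
multiplying by the 333 inner iterations gives 18629352, which matches the
expression in the testcase.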

BTW, I'm really surprised we vectorize this even without -Ofast: it is a double
reduction, so vectorizing it results in different floating point operations
being performed than in the non-vectorized case.
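
As a general illustration of why reordering a floating point reduction
matters, here is a minimal sketch. It is not what GCC emits for the testcase
above; sum_in_order and sum_two_lanes are hypothetical names, and the two-lane
split only mimics how a vectorized reduction keeps separate partial sums. It
assumes double arithmetic is evaluated in IEEE 754 binary64 without excess
precision, as on x86_64 with SSE2.

```c
#include <stddef.h>

/* Scalar reduction: add the elements strictly left to right.  */
double
sum_in_order (const double *v, size_t n)
{
  double acc = 0.0;
  size_t i;
  for (i = 0; i < n; i++)
    acc += v[i];
  return acc;
}

/* Mimics a 2-lane vectorized reduction: even and odd elements are
   accumulated separately and the two partial sums are combined at
   the end.  */
double
sum_two_lanes (const double *v, size_t n)
{
  double lane0 = 0.0, lane1 = 0.0;
  size_t i;
  for (i = 0; i + 1 < n; i += 2)
    {
      lane0 += v[i];
      lane1 += v[i + 1];
    }
  if (i < n)
    lane0 += v[i];	/* Leftover element when n is odd.  */
  return lane0 + lane1;
}
```

For the input { 1e16, 1.0, -1e16, 1.0 }, sum_in_order returns 1.0 (the first
1.0 is lost when rounded into 1e16), while sum_two_lanes returns 2.0, under the
assumption stated above.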
