[Bug tree-optimization/56764] New: vect_prune_runtime_alias_test_list not smart enough

jakub at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Thu Mar 28 13:00:00 GMT 2013


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56764

             Bug #: 56764
           Summary: vect_prune_runtime_alias_test_list not smart enough
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: jakub@gcc.gnu.org


__attribute__((noinline, noclone)) void
foo (float x[3][32], float y1, float y2, float y3,
     float *z1, float *z2, float *z3)
{
  int i;
  for (i = 0; i < 32; i++)
    {
      z1[i] = -y1 * x[0][i];
      z2[i] = -y2 * x[1][i];
      z3[i] = -y3 * x[2][i];
    }
}

float x[6][32] __attribute__((aligned (32)));

int
main ()
{
  int i;
  for (i = 0; i < 32; i++)
    {
      x[0][i] = i;
      x[1][i] = 7 * i;
      x[2][i] = -5.5 * i;
    }
  for (i = 0; i < 100000000; i++)
    foo (&x[0], 12.5, 0.5, -1.5, &x[3][0], &x[4][0], &x[5][0]);
  return 0;
}

This loop isn't vectorized on x86_64-linux with -O3 -mavx, because there are
too many versioning checks for alias.  We vectorize it only with
--param vect-max-version-for-alias-checks=12.  But I don't see why we'd need
to emit that many checks for versioning: instead of the 12 alias checks we
emit now, we could emit just 6.  Keep the 3 overlap checks between z1, z2 and
z3, and for each zN merge the zN vs. &x[0][0], zN vs. &x[1][0] and zN vs.
&x[2][0] tests into a single test comparing the zN[0] through zN[31] range
with the &x[0][0] through &x[2][31] range.  Similarly, if we wanted to do a
runtime check for alignment (apparently not the case on x86_64), we could
test only the alignment of &x[0][0], because it provably has the same
alignment as &x[1][0] and &x[2][0].
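
For illustration only, the 6 merged checks could look roughly like the
following when written as plain C.  This is a sketch of the idea, not what the
vectorizer emits: the names ranges_disjoint and can_use_vector_version are
made up here and are not GCC internals, and the byte-range comparison is an
assumption about how the merged test would be expressed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define N 32

/* Hypothetical helper: true if the byte ranges [a, a + a_len) and
   [b, b + b_len) do not overlap.  */
static bool
ranges_disjoint (const void *a, size_t a_len, const void *b, size_t b_len)
{
  uintptr_t pa = (uintptr_t) a;
  uintptr_t pb = (uintptr_t) b;
  return pa + a_len <= pb || pb + b_len <= pa;
}

/* Hypothetical guard for the vectorized loop version: 6 checks in total
   instead of 12.  */
static bool
can_use_vector_version (float x[3][N], float *z1, float *z2, float *z3)
{
  const size_t z_len = N * sizeof (float);      /* zN[0] through zN[31] */
  const size_t x_len = 3 * N * sizeof (float);  /* x[0][0] through x[2][31] */

  /* 3 checks: the stores must not overlap each other.  */
  return ranges_disjoint (z1, z_len, z2, z_len)
	 && ranges_disjoint (z1, z_len, z3, z_len)
	 && ranges_disjoint (z2, z_len, z3, z_len)
	 /* 3 checks: each store range against the whole x[0..2] block,
	    replacing the 9 separate zN vs. x[k] tests.  */
	 && ranges_disjoint (z1, z_len, x, x_len)
	 && ranges_disjoint (z2, z_len, x, x_len)
	 && ranges_disjoint (z3, z_len, x, x_len);
}
```

With the arguments used in main above (x rows 0 through 2 as input, rows 3
through 5 as outputs), all 6 checks pass, so the vectorized version would be
taken.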


