[Bug tree-optimization/56764] New: vect_prune_runtime_alias_test_list not smart enough
jakub at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Thu Mar 28 13:00:00 GMT 2013
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56764
Bug #: 56764
Summary: vect_prune_runtime_alias_test_list not smart enough
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: jakub@gcc.gnu.org
__attribute__((noinline, noclone)) void
foo (float x[3][32], float y1, float y2, float y3,
     float *z1, float *z2, float *z3)
{
  int i;
  for (i = 0; i < 32; i++)
    {
      z1[i] = -y1 * x[0][i];
      z2[i] = -y2 * x[1][i];
      z3[i] = -y3 * x[2][i];
    }
}

float x[6][32] __attribute__((aligned (32)));

int
main ()
{
  int i;
  for (i = 0; i < 32; i++)
    {
      x[0][i] = i;
      x[1][i] = 7 * i;
      x[2][i] = -5.5 * i;
    }
  for (i = 0; i < 100000000; i++)
    foo (&x[0], 12.5, 0.5, -1.5, &x[3][0], &x[4][0], &x[5][0]);
  return 0;
}

The testcase above isn't vectorized on x86_64-linux with -O3 -mavx, because
there are too many versioning checks for alias. We vectorize it only with
--param vect-max-version-for-alias-checks=12. But I don't see why we'd need
to emit that many checks for versioning: instead of the 12 checks for aliasing
we emit, we could emit just 6 (keep the 3 overlap checks in between z1, z2 and
z3, and just merge each of the zN vs. &x[0][0], zN vs. &x[1][0] and zN vs.
&x[2][0] tests into one comparing the zN[0] through zN[31] range with &x[0][0]
through &x[2][31]). Similarly, if we wanted to do a runtime check for alignment
(not the case on x86_64 apparently), we could test only alignment of &x[0][0],
because it is provably the same alignment as &x[1][0] and &x[2][0].
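
To illustrate the merged form, here is a rough hand-written sketch of a guard
with only 6 range checks: 3 for the zN pairs and 3 for each zN against the
whole of x[0..2]. The names ranges_overlap and foo_no_alias_p are hypothetical
and exist only for this illustration; this is not what the vectorizer emits,
and it conservatively compares the full 32-element ranges rather than the
per-iteration segments the vectorizer reasons about.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if the byte ranges [a, a + alen) and
   [b, b + blen) overlap.  Addresses are compared as integers so that
   pointers into unrelated objects can be compared safely.  */
static int
ranges_overlap (const void *a, size_t alen, const void *b, size_t blen)
{
  uintptr_t pa = (uintptr_t) a;
  uintptr_t pb = (uintptr_t) b;
  return pa < pb + blen && pb < pa + alen;
}

/* Hypothetical guard for foo: nonzero if none of the stores through
   z1, z2 and z3 can overlap each other or the loads from x.  */
static int
foo_no_alias_p (float x[3][32], float *z1, float *z2, float *z3)
{
  const size_t zlen = 32 * sizeof (float);      /* zN[0] .. zN[31] */
  const size_t xlen = 3 * 32 * sizeof (float);  /* x[0][0] .. x[2][31] */

  /* 3 checks among the stores.  */
  if (ranges_overlap (z1, zlen, z2, zlen)
      || ranges_overlap (z1, zlen, z3, zlen)
      || ranges_overlap (z2, zlen, z3, zlen))
    return 0;

  /* 3 merged checks: each zN against all three rows of x at once,
     instead of one check per (zN, x[k]) pair.  */
  if (ranges_overlap (z1, zlen, x, xlen)
      || ranges_overlap (z2, zlen, x, xlen)
      || ranges_overlap (z3, zlen, x, xlen))
    return 0;

  return 1;
}
```

The merge works because x[0], x[1] and x[2] are rows of one array and
therefore contiguous, so one range covers all three.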