[Bug tree-optimization/68558] New: Fails to SLP loop
rguenth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Thu Nov 26 14:44:00 GMT 2015
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68558
Bug ID: 68558
Summary: Fails to SLP loop
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Blocks: 53947
Target Milestone: ---
void IMB_double_fast_x (int *destf, int *dest, int y, int *p1f)
{
  int i;
  for (i = y; i > 0; i--)
    {
      *dest++ = 0;
      destf[0] = p1f[0];
      destf[1] = p1f[1];
      destf[2] = p1f[2];
      destf[3] = p1f[3];
      destf[4] = p1f[8];
      destf[5] = p1f[9];
      destf[6] = p1f[10];
      destf[7] = p1f[11];
      destf += 8;
      p1f += 12;
    }
}
This fails to SLP because of:
t.c:4:3: note: Detected interleaving store of size 8 starting with *destf_37 = _13;
t.c:4:3: note: Detected interleaving load of size 12 starting with _13 = *p1f_39;
t.c:4:3: note: Data access with gaps requires scalar epilogue loop
...
t.c:4:3: note: Build SLP failed: the number of interleaved loads is greater than the SLP group size _13 = *p1f_39;
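To make the two group sizes concrete, here is a rough source-level sketch of the access pattern. It is only an illustration, not how the vectorizer represents the groups internally. The names `double_fast_x_table`, `load_lane`, `STORE_GROUP` and `LOAD_GROUP` are made up for this example. The store group covers 8 consecutive ints, while the load group spans 12 ints of which only lanes 0..3 and 8..11 are read, leaving a gap at lanes 4..7.

```c
/* Hypothetical names, for illustration only.  Lane k of the 8-wide
   store group reads lane load_lane[k] of the 12-wide load group;
   lanes 4..7 of the load group are never read (the gap).  */
enum { STORE_GROUP = 8, LOAD_GROUP = 12 };
static const int load_lane[STORE_GROUP] = { 0, 1, 2, 3, 8, 9, 10, 11 };

/* Same stores as IMB_double_fast_x, written as a table-driven loop.  */
void double_fast_x_table (int *destf, int *dest, int y, const int *p1f)
{
  for (int i = y; i > 0; i--)
    {
      *dest++ = 0;
      for (int k = 0; k < STORE_GROUP; k++)
        destf[k] = p1f[load_lane[k]];
      destf += STORE_GROUP;
      p1f += LOAD_GROUP;
    }
}
```

Read this way, the load group (12) is larger than the store group (8), which appears to be what the last diagnostic above is complaining about.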
Splitting the load group doesn't help because then we'll hit
t.c:4:3: note: Build SLP failed: different interleaving chains in one node
Splitting the store group into vector-size pieces would generally make sense,
but it may have interesting effects on SLP discovery: for example, without
also splitting the loads we will hit the first issue above.
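As a hand-written sketch of what splitting both groups into vector-size pieces would amount to at the source level: this assumes a vector holds 4 ints (e.g. 128-bit vectors with 32-bit int) and that the source and destination buffers do not overlap, neither of which the testcase guarantees. The function name `double_fast_x_split` is hypothetical, and this is not a description of what the vectorizer would actually emit.

```c
#include <string.h>

/* Hypothetical illustration: each iteration is two independent
   4-int copies, one from p1f[0..3] and one from p1f[8..11].
   Assumes destf and p1f do not overlap (memcpy requires that).  */
void double_fast_x_split (int *destf, int *dest, int y, const int *p1f)
{
  for (int i = y; i > 0; i--)
    {
      *dest++ = 0;
      memcpy (destf, p1f, 4 * sizeof (int));         /* first piece  */
      memcpy (destf + 4, p1f + 8, 4 * sizeof (int)); /* second piece */
      destf += 8;
      p1f += 12;
    }
}
```

Each piece pairs a 4-wide store with a contiguous 4-wide load, so no piece on its own needs a load permutation. The gap at p1f[4..7] remains between the two loads.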
The best fix would be to lift the above restrictions and let permutation
support decide whether it can create the required loads or not.
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations