[Bug tree-optimization/103116] SLP vectoriser fails to peel for gaps
rguenth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Tue May 3 13:19:58 GMT 2022
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103116
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
So the issue is that we have group_size == 2 and nunits == 4, but still
gap == 0. That makes get_group_load_store_type assume overrun_p = false.
I suppose that if we had 8 elements in x and four times the first and
second in y, peeling one vector iteration as scalar would not be enough to
avoid the breakage.
So while peeling for gaps helps in this particular case, it's not the
solution for the more general problem. Here instead I think we need to
enforce a minimum vectorization factor so that nunits divides
group_size * vf (or at least nunits/2 does, to allow peeling for gaps to
work).
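
To make the divisibility condition concrete, here is a minimal sketch in
plain C. It is not GCC code: min_vf_for and gcd_u are hypothetical helper
names, and "lanes" stands for either nunits or nunits/2 depending on which
form of the condition is wanted. The smallest vf for which lanes divides
group_size * vf is lanes / gcd (group_size, lanes).

```c
#include <assert.h>

/* Greatest common divisor (Euclid's algorithm).  */
static unsigned
gcd_u (unsigned a, unsigned b)
{
  while (b != 0)
    {
      unsigned t = a % b;
      a = b;
      b = t;
    }
  return a;
}

/* Hypothetical helper, not part of GCC: return the smallest vf >= 1 such
   that LANES divides GROUP_SIZE * vf.  Pass nunits for the strict
   condition, or nunits / 2 for the relaxed one that relies on peeling
   for gaps.  */
static unsigned
min_vf_for (unsigned group_size, unsigned lanes)
{
  assert (group_size > 0 && lanes > 0);
  return lanes / gcd_u (group_size, lanes);
}
```

For the case in this bug (group_size == 2, nunits == 4), min_vf_for (2, 4)
gives 2 for the strict condition, and min_vf_for (2, 4 / 2) gives 1 for the
relaxed one.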
ISTR we specifically did not do this, to allow more vectorization, though.
The better alternative would then be to allow a smaller vector size to be
used for the load, with all the ripple-down effects that might have (and
only enforce a larger VF if there is no such vector type).
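
The alternative can be sketched the same way. Again this is only an
illustration in plain C, not GCC code: pick_load_lanes is a hypothetical
name, the sketch assumes nunits is a power of two and that every smaller
power-of-two lane count down to 2 is available as a vector type, which a
real target may not provide.

```c
#include <assert.h>

/* Hypothetical helper, not part of GCC: starting from NUNITS (assumed to
   be a power of two), halve the lane count until it divides
   GROUP_SIZE * VF.  Return that lane count, or 0 if no count >= 2 fits,
   which is the case where a larger VF would have to be enforced.  */
static unsigned
pick_load_lanes (unsigned group_size, unsigned vf, unsigned nunits)
{
  assert (group_size > 0 && vf > 0 && nunits > 0);
  for (unsigned lanes = nunits; lanes >= 2; lanes /= 2)
    if ((group_size * vf) % lanes == 0)
      return lanes;
  return 0;
}
```

With group_size == 2, vf == 1 and nunits == 4 this picks a 2-lane load; with
group_size == 3 it returns 0, meaning only a larger VF would satisfy the
condition.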