[Bug tree-optimization/94092] Code size and performance degradations after -ftree-loop-distribute-patterns was enabled at -O[2s]+

rguenth at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Wed Feb 24 07:43:57 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94092

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Yes, it was expected that the patch cannot handle all cases, since the
ldist transform most definitely loses information about the access that is
otherwise used to improve alignment info.  I suggested adding
__builtin_mem{set,cpy,move} variants with an extra argument specifying the
(common, for cpy/move) 'alignment size'.
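
To illustrate the information loss, here is a minimal sketch in C.  The
function names are made up for this example, and the second function only
models at source level what the loop roughly becomes after ldist; it is
not taken from the patch or from GCC dumps.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A store loop of the kind ldist recognizes as a memset pattern.
   Each store is a uint64_t access, so the compiler can assume the
   natural alignment of uint64_t for the destination.  */
void zero_words(uint64_t *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        p[i] = 0;
}

/* Source-level model of the result of the transform (hypothetical name).
   memset takes a plain void *, so the call itself says nothing about the
   access size or alignment that the stores above implied.  The variant
   suggested in the comment would carry that as an extra argument; no
   such builtin is assumed to exist here.  */
void zero_words_as_memset(uint64_t *p, size_t n)
{
    memset(p, 0, n * sizeof *p);
}
```

Both functions clear the same bytes; the difference is only in what a
later pass can still infer about the destination when expanding the call.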

Note that for loops with no known upper bound we want to dispatch to libc
memset even when we know a lot about the alignment, since libc is expected
to have optimal code sequences for a variety of alignment/size
combinations.  As said elsewhere, I believe we have code that can
dynamically dispatch between a short inline sequence and a libcall
depending on the actual length, but I don't remember whether this is/was in
generic code or in target specific code (I think it is done only when
value-profile data is available).
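
As a rough source-level picture of such a length-based dispatch, consider
the following sketch.  The function name and the threshold of 16 bytes are
assumptions chosen for illustration; this is not how GCC's expander is
written, and an optimizing compiler may well turn the byte loop back into
a memset call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cut-off below which the inline path is used.  */
enum { INLINE_LIMIT = 16 };

/* Clear LEN bytes at DST: short lengths take an inline byte loop,
   everything else goes to the libc memset.  Returns DST.  */
void *clear_dispatch(void *dst, size_t len)
{
    if (len <= INLINE_LIMIT) {
        unsigned char *d = dst;
        /* Stand-in for the short inline sequence.  */
        for (size_t i = 0; i < len; i++)
            d[i] = 0;
        return dst;
    }
    /* Larger or unknown lengths: let libc pick the best sequence.  */
    return memset(dst, 0, len);
}
```

With value-profile data the threshold test would be biased towards the
commonly observed lengths; without it, the libcall alone is the safe
default for unbounded lengths.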

