[Bug c++/91940] __builtin_bswap16 loop optimization
jakub at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Mon Sep 30 17:17:00 GMT 2019
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91940
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-09-30
                 CC|                            |jakub at gcc dot gnu.org
     Ever confirmed|0                           |1
--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The loop with the rotate is vectorized, while the one with __builtin_bswap16 is
not. For rotates, if the ISA has no vector support for them, we use
vect_recog_rotate_pattern to undo the matching of a hand-written rotate into a
rotate, by breaking it up again into shifts + blend.
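
For reference, the two loop shapes being compared might look like the sketch
below. This is an assumed reconstruction, not the test case from the report:
the function names swap_rotate, swap_loop_rotate and swap_loop_builtin are
hypothetical, and only __builtin_bswap16 is a real GCC builtin.

```c
#include <stddef.h>
#include <stdint.h>

/* Hand-written 16-bit byte swap, written as a rotate by 8 bits.
   The operands are promoted to int, so the result is cast back. */
static inline uint16_t swap_rotate(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Loop using the hand-written rotate (hypothetical name). */
void swap_loop_rotate(uint16_t *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = swap_rotate(a[i]);
}

/* Loop using the builtin (hypothetical name). */
void swap_loop_builtin(uint16_t *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = __builtin_bswap16(a[i]);
}
```

Both loops compute the same values; they differ only in what the vectorizer
sees. Comparing the output of -O3 -fopt-info-vec for the two functions is one
way to check which of them gets vectorized with a given compiler and target.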
For __builtin_bswap* we have vectorizable_bswap support, but it only works if
there is no type promotion in the call argument; in that case it is not handled
using rotates etc., but as a permutation of the vector elements (if supported).
Unfortunately, for __builtin_bswap16 the argument is promoted.
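
To illustrate what "bswap as a permutation" means, here is a scalar sketch
that emulates the byte shuffle on a block of eight 16-bit lanes. It is not the
vectorizer's code; LANES, NBYTES, make_bswap16_perm and bswap16_by_permute are
hypothetical names chosen for the example, and the 8-lane width is an
assumption standing in for a 128-bit vector.

```c
#include <stdint.h>
#include <string.h>

/* Assumed vector shape: 8 lanes of 16 bits, i.e. 16 bytes. */
enum { LANES = 8, NBYTES = 16 };

/* Build the byte permutation 1,0,3,2,5,4,... which swaps the two
   bytes inside every 16-bit lane. */
static void make_bswap16_perm(unsigned char perm[NBYTES])
{
    for (int i = 0; i < NBYTES; i++)
        perm[i] = (unsigned char)(i ^ 1);
}

/* Byte-swap every lane by applying the permutation to the raw bytes,
   the way a single vector permute instruction would. */
static void bswap16_by_permute(uint16_t v[LANES])
{
    unsigned char in[NBYTES], out[NBYTES], perm[NBYTES];

    make_bswap16_perm(perm);
    memcpy(in, v, NBYTES);
    for (int i = 0; i < NBYTES; i++)
        out[i] = in[perm[i]];
    memcpy(v, out, NBYTES);
}
```

Because the permutation acts on bytes in memory, the result matches
__builtin_bswap16 per lane regardless of host endianness.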
So the options are either to look through the argument promotion in
vectorizable_bswap, or to pattern match in tree-vect-patterns.c the
__builtin_bswap16 on a promoted integer into a call with a non-promoted
argument, optionally checking whether the permutation would be supported and
maybe falling back to a rotate that vect_recog_rotate_pattern can produce.