[Bug target/105354] __builtin_shuffle for alignr generates suboptimal code unless SSSE3 is enabled

cvs-commit at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Mon May 9 13:23:29 GMT 2022


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105354

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:fcda0efccad41eba9134c1bd9d024a93d93fb82f

commit r13-210-gfcda0efccad41eba9134c1bd9d024a93d93fb82f
Author: liuhongt <hongtao.liu@intel.com>
Date:   Wed Apr 27 16:24:44 2022 +0800

    Implement permutation with pslldq + psrldq + por when pshufb is not available.

    pand/pandn may be used to clear the upper/lower bits of the operands; in
    that case the permutation takes 4-5 instructions, which is still better
    than scalar code.

    gcc/ChangeLog:

            PR target/105354
            * config/i386/i386-expand.cc
            (expand_vec_perm_pslldq_psrldq_por): New function.
            (ix86_expand_vec_perm_const_1): Try
            expand_vec_perm_pslldq_psrldq_por for both 3-instruction and
            4/5-instruction sequences.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr105354-1.c: New test.
            * gcc.target/i386/pr105354-2.c: New test.
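
For readers unfamiliar with the trick, the 3-instruction case can be modelled in
portable C. This is only an illustrative sketch, not the code in
i386-expand.cc: the helper names (model_psrldq, model_pslldq, model_por,
reference_shuffle, alignr_matches_shuffle, check_alignr_model) are hypothetical,
and the byte-array model assumes the usual semantics of the SSE2 whole-register
byte shifts (zero fill). It checks that an alignr-style shuffle selecting
elements n..n+15 of the concatenation a:b equals
psrldq(a, n) | pslldq(b, 16 - n).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 16u

/* Model of psrldq: shift the 16-byte vector right by n bytes, zero fill. */
static void model_psrldq(uint8_t out[VEC_BYTES], const uint8_t in[VEC_BYTES],
                         unsigned n)
{
    for (unsigned i = 0; i < VEC_BYTES; i++)
        out[i] = (i + n < VEC_BYTES) ? in[i + n] : 0;
}

/* Model of pslldq: shift the 16-byte vector left by n bytes, zero fill. */
static void model_pslldq(uint8_t out[VEC_BYTES], const uint8_t in[VEC_BYTES],
                         unsigned n)
{
    for (unsigned i = 0; i < VEC_BYTES; i++)
        out[i] = (i >= n) ? in[i - n] : 0;
}

/* Model of por: bytewise OR of two vectors. */
static void model_por(uint8_t out[VEC_BYTES], const uint8_t x[VEC_BYTES],
                      const uint8_t y[VEC_BYTES])
{
    for (unsigned i = 0; i < VEC_BYTES; i++)
        out[i] = (uint8_t)(x[i] | y[i]);
}

/* Reference two-operand shuffle: result[i] = (a:b)[i + n],
   where indices 0..15 select from a and 16..31 select from b. */
static void reference_shuffle(uint8_t out[VEC_BYTES],
                              const uint8_t a[VEC_BYTES],
                              const uint8_t b[VEC_BYTES], unsigned n)
{
    for (unsigned i = 0; i < VEC_BYTES; i++) {
        unsigned idx = i + n;
        out[i] = (idx < VEC_BYTES) ? a[idx] : b[idx - VEC_BYTES];
    }
}

/* Returns 1 if the shift/shift/or sequence reproduces the shuffle
   for byte offset n (0 <= n <= 16), 0 otherwise. */
int alignr_matches_shuffle(unsigned n)
{
    uint8_t a[VEC_BYTES], b[VEC_BYTES];
    uint8_t lo[VEC_BYTES], hi[VEC_BYTES], got[VEC_BYTES], want[VEC_BYTES];

    for (unsigned i = 0; i < VEC_BYTES; i++) {
        a[i] = (uint8_t)(0x10 + i);   /* distinct, nonzero test bytes */
        b[i] = (uint8_t)(0xA0 + i);
    }

    model_psrldq(lo, a, n);              /* low part comes from a  */
    model_pslldq(hi, b, VEC_BYTES - n);  /* high part comes from b */
    model_por(got, lo, hi);              /* merge the two halves   */

    reference_shuffle(want, a, b, n);
    return memcmp(got, want, VEC_BYTES) == 0;
}

/* Check every byte offset the model supports. */
void check_alignr_model(void)
{
    for (unsigned n = 0; n <= VEC_BYTES; n++)
        assert(alignr_matches_shuffle(n));
}
```

In this model the two shifts already zero the bytes that the other operand must
supply, so no masking is needed. The 4-5 instruction case mentioned in the
commit message arises when the shifts alone leave unwanted bytes behind, which
pand/pandn then clear before the final por.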

