[Bug target/91824] unnecessary sign-extension after _mm_movemask_epi8 or __builtin_popcount

cvs-commit at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Thu Jan 30 08:42:00 GMT 2020


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91824

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:d37c81f476c17d292943189335d745c3fb817b7d

commit r10-6346-gd37c81f476c17d292943189335d745c3fb817b7d
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Jan 30 09:41:00 2020 +0100

    i386: Optimize {,v}{,p}movmsk{b,ps,pd} followed by sign extension [PR91824]

    Some time ago, patterns were added to optimize move mask followed by zero
    extension from 32 bits to 64 bits.  As the testcase shows, the intrinsics
    actually return int, not unsigned int, so it will happen quite often that
    one actually needs sign extension instead of zero extension.  Except for
    vpmovmskb with 256-bit operand, sign vs. zero extension doesn't make a
    difference, as we know the bit 31 will not be set (the source will have 2 or
    4 doubles, 4 or 8 floats or 16 or 32 chars).
    So, for the floating point patterns, this patch just uses a code iterator
    so that we handle both zero extend and sign extend, and for the byte one
    adds a separate pattern for the 128-bit operand.
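
The reasoning above (sign and zero extension give the same result whenever bit 31 of the mask is clear) can be checked with a small portable C sketch. It does not use the real intrinsics; the helper names sign_extend, zero_extend and extensions_agree are hypothetical and only model widening an int mask to 64 bits.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of GCC or the intrinsics headers.
   They model widening an `int` move-mask result to 64 bits. */

/* Sign extension: what C does when an int is converted to a 64-bit signed type. */
static int64_t sign_extend(int32_t mask)
{
    return (int64_t)mask;
}

/* Zero extension: reinterpret the low 32 bits as unsigned, then widen. */
static int64_t zero_extend(int32_t mask)
{
    return (int64_t)(uint32_t)mask;
}

/* Returns 1 when both extensions yield the same 64-bit value,
   which holds exactly when bit 31 of the mask is clear. */
static int extensions_agree(int32_t mask)
{
    return sign_extend(mask) == zero_extend(mask);
}
```

A 16-bit mask such as 0xFFFF (the largest value a 128-bit byte mask can produce) widens identically either way, while a mask with bit 31 set, which only the 256-bit byte variant can produce, does not.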

    2020-01-30  Jakub Jelinek  <jakub@redhat.com>

        PR target/91824
        * config/i386/sse.md
        (*<sse>_movmsk<ssemodesuffix><avxsizesuffix>_zext): Renamed to ...
        (*<sse>_movmsk<ssemodesuffix><avxsizesuffix>_<u>ext): ... this.  Use
        any_extend code iterator instead of always zero_extend.
        (*<sse>_movmsk<ssemodesuffix><avxsizesuffix>_zext_lt): Renamed to ...
        (*<sse>_movmsk<ssemodesuffix><avxsizesuffix>_<u>ext_lt): ... this.
        Use any_extend code iterator instead of always zero_extend.
        (*<sse>_movmsk<ssemodesuffix><avxsizesuffix>_zext_shift): Renamed to ...
        (*<sse>_movmsk<ssemodesuffix><avxsizesuffix>_<u>ext_shift): ... this.
        Use any_extend code iterator instead of always zero_extend.
        (*sse2_pmovmskb_ext): New define_insn.
        (*sse2_pmovmskb_ext_lt): New define_insn_and_split.

        * gcc.target/i386/pr91824-2.c: New test.

