[Bug target/96906] Failure to optimize __builtin_ia32_psubusw128 compared to 0 to __builtin_ia32_pminuw128 compared to operand

cvs-commit at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Thu Dec 3 05:45:05 GMT 2020


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96906

--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:70310982492071f98eacdac0747521769b0f0328

commit r11-5697-g70310982492071f98eacdac0747521769b0f0328
Author: liuhongt <hongtao.liu@intel.com>
Date:   Mon Nov 30 13:27:16 2020 +0800

    Optimize vpsubusw compared to 0 into vpcmpleuw or vpcmpnleuw [PR96906]

    For signed comparisons, it handles cases that are eq or neq to 0.
    For unsigned comparisons, it additionally handles cases that are le or
    gt to 0 (equivalent to eq or neq to 0). Transform case eq to leu,
    case neq to gtu.

    i.e. for -mavx512bw -mavx512vl, transform eq case code from

            vpsubusw        %xmm1, %xmm0, %xmm0
            vpxor   %xmm1, %xmm1, %xmm1
            vpcmpeqw  %xmm1, %xmm0, %k0
    to
            vpcmpleuw       %xmm1, %xmm0, %k0

    i.e. for -mavx512bw -mavx512vl, transform neq case code from

            vpsubusw        %xmm1, %xmm0, %xmm0
            vpxor   %xmm1, %xmm1, %xmm1
            vpcmpneqw  %xmm1, %xmm0, %k0
    to
            vpcmpnleuw       %xmm1, %xmm0, %k0

    gcc/ChangeLog
            PR target/96906
            * config/i386/sse.md
            (<avx512>_ucmp<mode>3<mask_scalar_merge_name>): Add a new
            define_split after this insn.

    gcc/testsuite/ChangeLog

            * gcc.target/i386/avx512bw-pr96906-1.c: New test.
            * gcc.target/i386/pr96906-1.c: Add -mno-avx512f.
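
The transform described in the commit message rests on a per-lane identity: an unsigned saturating subtraction a - b is zero exactly when a <= b, and nonzero exactly when a > b. The following is a minimal scalar sketch of that identity for one 16-bit lane. It is not the code from the patch or from the new test. All function names (subus16, eq_zero_form, leu_form, neq_zero_form, gtu_form, count_mismatches, self_check) are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Scalar model of one 16-bit lane of vpsubusw: unsigned saturating a - b.
   (Illustrative helper, not part of GCC.) */
static uint16_t subus16(uint16_t a, uint16_t b)
{
    return a > b ? (uint16_t)(a - b) : 0;
}

/* "eq" case before the transform: saturating difference compared with 0. */
static bool eq_zero_form(uint16_t a, uint16_t b)
{
    return subus16(a, b) == 0;
}

/* "eq" case after the transform: one unsigned less-or-equal compare. */
static bool leu_form(uint16_t a, uint16_t b)
{
    return a <= b;
}

/* "neq" case before the transform. */
static bool neq_zero_form(uint16_t a, uint16_t b)
{
    return subus16(a, b) != 0;
}

/* "neq" case after the transform: one unsigned greater-than compare. */
static bool gtu_form(uint16_t a, uint16_t b)
{
    return a > b;
}

/* Count sampled (a, b) pairs on which the old and new forms disagree.
   A stride of 1 checks every pair; larger strides sample the space. */
static unsigned long count_mismatches(unsigned stride)
{
    unsigned long bad = 0;

    if (stride == 0)
        stride = 1;

    for (uint32_t a = 0; a <= UINT16_MAX; a += stride) {
        for (uint32_t b = 0; b <= UINT16_MAX; b += stride) {
            uint16_t x = (uint16_t)a, y = (uint16_t)b;
            uint16_t d = subus16(x, y);

            if (eq_zero_form(x, y) != leu_form(x, y))
                bad++;
            if (neq_zero_form(x, y) != gtu_form(x, y))
                bad++;
            /* For an unsigned value, "le 0" is the same as "eq 0"
               and "gt 0" is the same as "neq 0". */
            if ((d <= 0) != (d == 0) || (d > 0) != (d != 0))
                bad++;
        }
    }
    return bad;
}

/* Small self-check over a sampled subset of all 16-bit pairs. */
static void self_check(void)
{
    assert(count_mismatches(257) == 0);
}
```

Calling self_check() from a test driver exercises the sampled comparison. count_mismatches(1) walks all 2^32 pairs and is correspondingly slower.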

