[Bug target/96906] Failure to optimize __builtin_ia32_psubusw128 compared to 0 to __builtin_ia32_pminuw128 compared to operand
cvs-commit at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Thu Dec 3 05:45:05 GMT 2020
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96906
--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:
https://gcc.gnu.org/g:70310982492071f98eacdac0747521769b0f0328
commit r11-5697-g70310982492071f98eacdac0747521769b0f0328
Author: liuhongt <hongtao.liu@intel.com>
Date: Mon Nov 30 13:27:16 2020 +0800
Optimize vpsubusw compared to 0 into vpcmpleuw or vpcmpnleuw [PR96906]
For signed comparisons, the new split handles cases that are eq or neq to 0.
For unsigned comparisons, it additionally handles cases that are le or
gt to 0 (equivalent to eq or neq to 0). Transform case eq to leu,
case neq to gtu.
I.e., for -mavx512bw -mavx512vl, transform the eq case code from
vpsubusw %xmm1, %xmm0, %xmm0
vpxor %xmm1, %xmm1, %xmm1
vpcmpeqw %xmm1, %xmm0, %k0
to
vpcmpleuw %xmm1, %xmm0, %k0
I.e., for -mavx512bw -mavx512vl, transform the neq case code from
vpsubusw %xmm1, %xmm0, %xmm0
vpxor %xmm1, %xmm1, %xmm1
vpcmpneqw %xmm1, %xmm0, %k0
to
vpcmpnleuw %xmm1, %xmm0, %k0
gcc/ChangeLog
PR target/96906
* config/i386/sse.md
(<avx512>_ucmp<mode>3<mask_scalar_merge_name>): Add a new
define_split after this insn.
gcc/testsuite/ChangeLog
* gcc.target/i386/avx512bw-pr96906-1.c: New test.
* gcc.target/i386/pr96906-1.c: Add -mno-avx512f.
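
The transformation rests on a per-lane identity: an unsigned saturating subtraction a - b yields 0 exactly when a <= b (unsigned), and a nonzero value exactly when a > b. The following is a minimal scalar sketch of that identity, not the code from the commit; the helper names sat_sub_u16 and count_mismatches are hypothetical and only model one 16-bit lane of vpsubusw.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 16-bit lane of vpsubusw:
   unsigned subtraction that saturates at 0 instead of wrapping. */
static uint16_t sat_sub_u16(uint16_t a, uint16_t b)
{
    return a > b ? (uint16_t)(a - b) : 0;
}

/* Count lanes where "sat_sub compared to 0" disagrees with the direct
   unsigned comparison.  Every value of a is paired with a few boundary
   values of b and with its own neighbours. */
static unsigned count_mismatches(void)
{
    static const uint16_t probes[] = {0, 1, 0x7fff, 0x8000, 0xfffe, 0xffff};
    const unsigned nprobes = sizeof probes / sizeof probes[0];
    unsigned bad = 0;

    for (uint32_t i = 0; i <= 0xffff; i++) {
        uint16_t a = (uint16_t)i;
        for (unsigned j = 0; j < nprobes + 3; j++) {
            /* Last three iterations use a-1, a, a+1 (wrapping mod 2^16). */
            uint16_t b = j < nprobes ? probes[j]
                                     : (uint16_t)(a + (j - nprobes) - 1);
            uint16_t d = sat_sub_u16(a, b);
            bad += (d == 0) != (a <= b);  /* eq 0  <=> leu */
            bad += (d != 0) != (a > b);   /* neq 0 <=> gtu */
        }
    }
    return bad;
}
```

Under this model count_mismatches() returns 0, which is why the subtract, zero, and compare sequence can collapse into a single unsigned compare.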