[Bug target/97521] [11 Regression] wrong code with -mno-sse2 since r11-3394
rguenth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Fri Oct 23 06:21:21 GMT 2020
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97521
--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #15)
> CI with -march=cascadelake reports
>
..
> FAIL: gcc.target/i386/avx2-vpcmpeqq-2.c execution test
This test expands
(gdb) p debug_tree (exp)
 <vector_cst 0x7ffff4bbd390
    type <vector_type 0x7ffff683a3f0
        type <boolean_type 0x7ffff683a348 public QI
            size <integer_cst 0x7ffff680cdc8 constant 8>
            unit-size <integer_cst 0x7ffff680cde0 constant 1>
            align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7ffff683a348 precision:2 min <integer_cst 0x7ffff66c9df8 -2> max <integer_cst 0x7ffff66ff288 1>>
        QI size <integer_cst 0x7ffff680cdc8 8> unit-size <integer_cst 0x7ffff680cde0 1>
        align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7ffff683a3f0 nunits:4>
    constant tree_0 npatterns:2 nelts-per-pattern:2
    elt:0: <integer_cst 0x7ffff4b9f738 type <boolean_type 0x7ffff683a348> constant 0>
    elt:1: <integer_cst 0x7ffff4b9f678 type <boolean_type 0x7ffff683a348> constant -1>
    elt:2: <integer_cst 0x7ffff4b9f738 0>
    elt:3: <integer_cst 0x7ffff4b9f738 0>>
which shows that the heuristic cannot work. We could possibly refine it to
key on mode-precision component types, which _might_ work since x86 seems to
use the smallest integer mode that holds nunits bits, but that is of course
not guaranteed for non-x86 targets.

I wonder why we insist on "filling" the mask mode on GENERIC/GIMPLE
while RTL produces packed bits. That is, why do we use a
QImode vector(4) <signed-boolean:2> here instead of a
QImode vector(4) <signed-boolean:1> if the target will in the end produce that
from, say, a V4SImode compare-to-mask? As long as we didn't expose
temporaries of those types, this was well hidden until RTL expansion, which
then did "magic", but now we're really facing inconsistent representations.

Now targets _could_ opt to use QImode vector(4) <signed-boolean:2>, but then
they would have to represent { -1, -1, -1, -1 } as 0b11111111 (with the
'padding bits' sign-extended).
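
To make the two representations concrete, here is a minimal sketch in C of how
four boolean elements would land in a QImode-sized byte with 1 bit versus 2
bits per element. pack_mask is a hypothetical helper written for this
illustration and is not a GCC function; placing element 0 in the least
significant bits is also an assumption.

```c
#include <stdint.h>

/* Hypothetical helper, not part of GCC: pack NUNITS boolean elements
   (each 0 or -1) into one byte, giving every element BITS_PER_ELT bits.
   Element 0 goes into the least significant bits (an assumption).
   Requires nunits * bits_per_elt <= 8.  */
uint8_t
pack_mask (const int *elts, unsigned nunits, unsigned bits_per_elt)
{
  uint8_t mask = 0;
  /* Low BITS_PER_ELT bits set, e.g. 0x1 for 1 bit, 0x3 for 2 bits.  */
  unsigned field = (1u << bits_per_elt) - 1u;

  for (unsigned i = 0; i < nunits; i++)
    {
      /* A true element (-1) fills its whole field, which is what a
	 sign-extended signed-boolean:N element looks like.  */
      unsigned v = (unsigned) elts[i] & field;
      mask |= (uint8_t) (v << (i * bits_per_elt));
    }
  return mask;
}
```

Under these assumptions, the constant { 0, -1, 0, 0 } from the dump packs to
0b00000010 with 1 bit per element but to 0b00001100 with 2 bits per element,
and { -1, -1, -1, -1 } packs to 0b00001111 versus 0b11111111. The same byte
therefore means different things depending on which layout is assumed.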
For now I'm going to revert the patch but I still believe
const_scalar_mask_from_tree is a red herring.