[Bug tree-optimization/101434] New: vector-by-vector left shift expansion for char/short is not optimal
ubizjak at gmail dot com
gcc-bugzilla@gcc.gnu.org
Tue Jul 13 10:55:08 GMT 2021
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101434
Bug ID: 101434
Summary: vector-by-vector left shift expansion for char/short
is not optimal
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
The following testcase:
--cut here--
short r[8], a[8], b[8];

void f1 (void)
{
  int i;

  for (i = 0; i < 8; i++)
    r[i] = a[i] << b[i];
}
--cut here--
compiles with -O2 -ftree-vectorize -mxop to:
        vmovdqa   a(%rip), %xmm0
        vmovdqa   b(%rip), %xmm1
        vpmovsxwd %xmm0, %xmm2
        vpsrldq   $8, %xmm0, %xmm0
        vpmovsxwd %xmm1, %xmm3
        vpsrldq   $8, %xmm1, %xmm1
        vpshad    %xmm3, %xmm2, %xmm2
        vpmovsxwd %xmm0, %xmm0
        vpmovsxwd %xmm1, %xmm1
        vpshad    %xmm1, %xmm0, %xmm0
        vpperm    .LC0(%rip), %xmm0, %xmm2, %xmm2
        vmovdqa   %xmm2, r(%rip)
        ret
The SImode vpshad is used here, which requires sign extensions (vpmovsxwd),
byte shifts (vpsrldq) and a final permute (vpperm) around it. A single HImode
vpshaw should be emitted instead.
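
As a rough illustration of why a HImode shift is sufficient, the sketch below
compares the scalar loop from the testcase (operands promoted to int, result
truncated to short) with a shift done directly on 16-bit lanes using the
GCC/Clang vector extension. The names v8hi, shift_left_scalar,
shift_left_v8hi and count_mismatches are made up for this example, and the
inputs are chosen so that shift counts stay in 0..15 and no result overflows a
short. It only checks that the two forms agree on those inputs; it does not
verify which instructions any particular compiler version emits.

```c
/* GCC/Clang vector extension: eight 16-bit lanes in one 128-bit vector.
   (Hypothetical typedef name, chosen for this example.) */
typedef short v8hi __attribute__ ((vector_size (16)));

/* Scalar form, as in the testcase: a[i] and b[i] are promoted to int,
   shifted, and the result is truncated back to short. */
static void shift_left_scalar (short *r, const short *a, const short *b)
{
  for (int i = 0; i < 8; i++)
    r[i] = (short) (a[i] << b[i]);
}

/* Element-wise shift performed directly on the 16-bit lanes. */
static v8hi shift_left_v8hi (v8hi a, v8hi b)
{
  return a << b;
}

/* Returns the number of lanes where the two forms disagree
   for one fixed set of small, non-negative inputs. */
static int count_mismatches (void)
{
  const short a[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  const short b[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
  short expect[8];
  v8hi va = { 1, 2, 3, 4, 5, 6, 7, 8 };
  v8hi vb = { 0, 1, 2, 3, 4, 5, 6, 7 };
  v8hi vr;
  int mismatches = 0;

  shift_left_scalar (expect, a, b);
  vr = shift_left_v8hi (va, vb);

  for (int i = 0; i < 8; i++)
    if (vr[i] != expect[i])
      mismatches++;

  return mismatches;
}
```
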
Similar testcase:
--cut here--
short r[8], a[8], b[8];

void f2 (void)
{
  int i;

  for (i = 0; i < 8; i++)
    r[i] = a[i] >> b[i];
}
--cut here--
results in the expected HImode vector-by-vector shift insn:
        vpxor     %xmm0, %xmm0, %xmm0
        vpsubw    b(%rip), %xmm0, %xmm0
        vpshaw    %xmm0, a(%rip), %xmm0
        vmovdqa   %xmm0, r(%rip)
        ret
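(Never mind the vpxor and vpsubw; they are just one of the XOP peculiarities:
vpshaw shifts left for a positive count and right for a negative one, so the
count is negated first.)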
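
The negated count can be pictured with a simplified scalar model of one 16-bit
lane. This is only a sketch under stated assumptions: the helper names
shift_lane_model and shift_right_via_negated_count are invented for this
example, the count is assumed to lie in -15..15, and the conversion back to
int16_t assumes the usual two's-complement wrap-around that GCC provides.

```c
#include <stdint.h>

/* Hypothetical model of one 16-bit lane: a non-negative count shifts
   left, a negative count shifts right arithmetically.
   Assumes -15 <= count <= 15. */
static int16_t shift_lane_model (int16_t value, int count)
{
  if (count >= 0)
    {
      /* Shift in unsigned arithmetic, then keep the low 16 bits. */
      uint32_t wide = (uint32_t) (uint16_t) value << count;
      return (int16_t) (uint16_t) wide;
    }

  int n = -count;
  int32_t v = value;

  /* Arithmetic right shift without relying on the
     implementation-defined behaviour of >> on negative values. */
  return (int16_t) (v < 0 ? ~(~v >> n) : v >> n);
}

/* Right shift expressed as in the f2 output: negate the count
   (the vpxor/vpsubw pair), then apply the signed-count shift. */
static int16_t shift_right_via_negated_count (int16_t value, int count)
{
  return shift_lane_model (value, -count);
}
```
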