[Bug tree-optimization/101434] New: vector-by-vector left shift expansion for char/short is not optimal

ubizjak at gmail dot com gcc-bugzilla@gcc.gnu.org
Tue Jul 13 10:55:08 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101434

            Bug ID: 101434
           Summary: vector-by-vector left shift expansion for char/short
                    is not optimal
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ubizjak at gmail dot com
  Target Milestone: ---

Following testcase:

--cut here--
short r[8], a[8], b[8];

void f1 (void)
{
  int i;

  for (i = 0; i < 8; i++)
    r[i] = a[i] << b[i];
}
--cut here--

compiles with -O2 -ftree-vectorize -mxop to:

        vmovdqa a(%rip), %xmm0
        vmovdqa b(%rip), %xmm1
        vpmovsxwd       %xmm0, %xmm2
        vpsrldq $8, %xmm0, %xmm0
        vpmovsxwd       %xmm1, %xmm3
        vpsrldq $8, %xmm1, %xmm1
        vpshad  %xmm3, %xmm2, %xmm2
        vpmovsxwd       %xmm0, %xmm0
        vpmovsxwd       %xmm1, %xmm1
        vpshad  %xmm1, %xmm0, %xmm0
        vpperm  .LC0(%rip), %xmm0, %xmm2, %xmm2
        vmovdqa %xmm2, r(%rip)
        ret

SImode vpshad is used together with lots of other instructions, but a HImode
vpshaw should be emitted instead.

Similar testcase:

--cut here--
short r[8], a[8], b[8];

void f2 (void)
{
  int i;

  for (i = 0; i < 8; i++)
    r[i] = a[i] >> b[i];
}
--cut here--

results in expected HImode vect-by-vect shift insn:

        vpxor   %xmm0, %xmm0, %xmm0
        vpsubw  b(%rip), %xmm0, %xmm0
        vpshaw  %xmm0, a(%rip), %xmm0
        vmovdqa %xmm0, r(%rip)
        ret

(do not bother with vpxor and vpsubw, these are just one of XOP peculiarities.)


More information about the Gcc-bugs mailing list