[Bug target/101021] New: PSHUFB is emitted instead of PSHUFD, PSHUFLW and PSHUFHW with -msse4

ubizjak at gmail dot com gcc-bugzilla@gcc.gnu.org
Thu Jun 10 18:07:14 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101021

            Bug ID: 101021
           Summary: PSHUFB is emitted instead of PSHUFD, PSHUFLW and
                    PSHUFHW with -msse4
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ubizjak at gmail dot com
  Target Milestone: ---

The following testcase:

--cut here--
typedef char S;
typedef S VV __attribute__((vector_size(16 * sizeof(S))));

VV ref_perm_pshufd (VV x, VV y)
{
  return __builtin_shuffle (x, y, (VV) { 8,9,10,11, 8,9,10,11, 8,9,10,11,
                                         12,13,14,15 });
}

VV ref_perm_pshuflw (VV x)
{
  return __builtin_shuffle (x, (VV) { 0,1, 2,3, 2,3, 6,7,
                                      8,9,10,11,12,13,14,15 });
}

VV ref_perm_pshufhw (VV x)
{
  return __builtin_shuffle (x, (VV) { 0,1,2,3,4,5,6,7,
                                      8,9, 10,11, 10,11, 14,15 });
}
--cut here--

compiles with -O2 -msse2 to:

<ref_perm_pshufd>:

     pshufd $0xea,%xmm0,%xmm0
     retq   

<ref_perm_pshuflw>:

     pshuflw $0xd4,%xmm0,%xmm0
     retq   

<ref_perm_pshufhw>:

     pshufhw $0xd4,%xmm0,%xmm0
     retq   
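
The immediates above follow directly from the byte selectors in the testcase.
As a rough illustration only (the helpers pshufd_imm and pshufw_imm are
hypothetical names made up for this sketch, not GCC internals, and they
assume a single-input shuffle with all selector values below 16), the
mapping from a byte mask to the immediate could be written as:

```c
#include <assert.h>

/* Hypothetical helper, not part of GCC: derive the PSHUFD immediate from a
   16-entry byte selector.  Returns -1 if the selector does not move whole,
   aligned 32-bit elements.  */
static int
pshufd_imm (const unsigned char sel[16])
{
  unsigned imm = 0;

  for (unsigned d = 0; d < 4; d++)
    {
      unsigned base = sel[4 * d];

      /* Each destination dword must start at a dword boundary of the
         source and copy four consecutive bytes.  */
      if (base >= 16 || base % 4 != 0)
        return -1;
      for (unsigned b = 1; b < 4; b++)
        if (sel[4 * d + b] != base + b)
          return -1;

      /* Two immediate bits per destination dword.  */
      imm |= (base / 4) << (2 * d);
    }
  return (int) imm;
}

/* Hypothetical helper, not part of GCC: derive the PSHUFLW (half == 0) or
   PSHUFHW (half == 1) immediate.  The other 64-bit half must be left in
   place, and the chosen half must move whole 16-bit elements within
   itself.  Returns -1 otherwise.  */
static int
pshufw_imm (const unsigned char sel[16], unsigned half)
{
  assert (half == 0 || half == 1);

  unsigned lo = 8 * half;
  unsigned other = 8 * (1 - half);
  unsigned imm = 0;

  /* The untouched half must be an identity copy.  */
  for (unsigned i = 0; i < 8; i++)
    if (sel[other + i] != other + i)
      return -1;

  for (unsigned w = 0; w < 4; w++)
    {
      unsigned base = sel[lo + 2 * w];

      /* Each destination word must come from a word-aligned position
         inside the same 64-bit half.  */
      if (base % 2 != 0 || base < lo || base >= lo + 8
          || sel[lo + 2 * w + 1] != base + 1)
        return -1;

      /* Two immediate bits per destination word, relative to the half.  */
      imm |= ((base - lo) / 2) << (2 * w);
    }
  return (int) imm;
}
```

For the three masks in the testcase, this sketch gives 0xea for
ref_perm_pshufd (dwords 2,2,2,3) and 0xd4 for both ref_perm_pshuflw and
ref_perm_pshufhw (words 0,1,1,3 within the respective half), matching the
-msse2 output.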

With -msse4 (or a higher ISA level), the compiler is too eager to emit the
less optimal PSHUFB, which also needs its mask loaded from memory:

<ref_perm_pshufd>:

     pshufb 0x0(%rip),%xmm0
     retq   

<ref_perm_pshuflw>:

     pshufb 0x0(%rip),%xmm0
     retq   

<ref_perm_pshufhw>:

     pshufb 0x0(%rip),%xmm0
     retq

