[Bug target/68655] New: SSE2 cannot vec_perm of low and high part

rguenth at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Wed Dec 2 13:55:00 GMT 2015


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68655

            Bug ID: 68655
           Summary: SSE2 cannot vec_perm of low and high part
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
                CC: uros at gcc dot gnu.org
  Target Milestone: ---
            Target: x86_64-*-*, i?86-*-*

typedef unsigned short v8hi __attribute__((vector_size(16)));

v8hi foo (v8hi a, v8hi b)
{
  return __builtin_shuffle (a, b, (v8hi) { 0, 1, 2, 3, 8, 9, 10, 11 });
}

This should be able to use

  movlhps %xmm1, %xmm0
  ret

but instead the shuffle ends up being lowered by the vector lowering pass
because the target reports that
can_vec_perm_p (V8HI, false, { 0, 1, 2, 3, 8, 9, 10, 11 }) is false.
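
For illustration, the selector { 0, 1, 2, 3, 8, 9, 10, 11 } just concatenates
the low 64 bits of a and the low 64 bits of b. Below is a minimal sketch of
that equivalence using intrinsics, assuming an x86 target with SSE2. The
helper names perm_ref and perm_lo_lo are made up for this example and are not
part of GCC; _mm_movelh_ps is the intrinsic corresponding to movlhps.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE ones */
#include <stdint.h>
#include <string.h>

/* Scalar reference for a two-input permute of 8 x 16-bit lanes:
   selector values 0..7 pick from a, 8..15 pick from b. */
static void perm_ref(const uint16_t a[8], const uint16_t b[8],
                     const unsigned sel[8], uint16_t out[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = sel[i] < 8 ? a[sel[i]] : b[sel[i] - 8];
}

/* Hypothetical helper: selector { 0, 1, 2, 3, 8, 9, 10, 11 } done with
   a single movlhps-style operation (low half of a, then low half of b). */
static void perm_lo_lo(const uint16_t a[8], const uint16_t b[8],
                       uint16_t out[8])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* The casts only reinterpret the bits; they generate no instructions. */
    __m128 r = _mm_movelh_ps(_mm_castsi128_ps(va), _mm_castsi128_ps(vb));
    _mm_storeu_si128((__m128i *)out, _mm_castps_si128(r));
}
```

Calling perm_lo_lo and perm_ref with the selector above on the same inputs
should give identical results, which is the property the target hook would
need to recognize.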

Two-instruction permutes are also possible with movhlps/movlhps;
for example, { 0, 1, 2, 3, 12, 13, 14, 15 } can use

  movhlps %xmm1, %xmm1
  movlhps %xmm1, %xmm0

Ah, GCC already handles that one with shufpd.  Not sure why the first case
doesn't use shufpd as well, given that it is available in SSE2.
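
As a rough sketch of the shufpd point, both selectors can be written with
_mm_shuffle_pd (the intrinsic for shufpd), differing only in the immediate.
This assumes an x86 target with SSE2; the helper names perm_lo_hi_shufpd and
perm_lo_lo_shufpd are made up for this example and say nothing about how the
backend would actually implement it.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* Selector { 0, 1, 2, 3, 12, 13, 14, 15 }: low half of a, high half of b.
   Immediate bit 0 picks the element of a, bit 1 picks the element of b. */
static void perm_lo_hi_shufpd(const uint16_t a[8], const uint16_t b[8],
                              uint16_t out[8])
{
    __m128d va = _mm_castsi128_pd(_mm_loadu_si128((const __m128i *)a));
    __m128d vb = _mm_castsi128_pd(_mm_loadu_si128((const __m128i *)b));
    __m128d r = _mm_shuffle_pd(va, vb, 2);  /* a[0], b[1] */
    _mm_storeu_si128((__m128i *)out, _mm_castpd_si128(r));
}

/* Selector { 0, 1, 2, 3, 8, 9, 10, 11 }: low half of a, low half of b.
   Same instruction, immediate 0. */
static void perm_lo_lo_shufpd(const uint16_t a[8], const uint16_t b[8],
                              uint16_t out[8])
{
    __m128d va = _mm_castsi128_pd(_mm_loadu_si128((const __m128i *)a));
    __m128d vb = _mm_castsi128_pd(_mm_loadu_si128((const __m128i *)b));
    __m128d r = _mm_shuffle_pd(va, vb, 0);  /* a[0], b[0] */
    _mm_storeu_si128((__m128i *)out, _mm_castpd_si128(r));
}
```

Since only the immediate changes between the two, it seems plausible that
whatever matches the second selector could match the first one too.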

