This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/68655] New: SSE2 cannot vec_perm of low and high part
- From: "rguenth at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Wed, 02 Dec 2015 13:55:48 +0000
- Subject: [Bug target/68655] New: SSE2 cannot vec_perm of low and high part
- Auto-submitted: auto-generated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68655
Bug ID: 68655
Summary: SSE2 cannot vec_perm of low and high part
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
CC: uros at gcc dot gnu.org
Target Milestone: ---
Target: x86_64-*-*, i?86-*-*
typedef unsigned short v8hi __attribute__((vector_size(16)));
v8hi foo (v8hi a, v8hi b)
{
  return __builtin_shuffle (a, b, (v8hi) { 0, 1, 2, 3, 8, 9, 10, 11 });
}
should be able to use
  movlhps %xmm1, %xmm0
  ret
but ends up being lowered by vector lowering because the target reports
that can_vec_perm_p (V8HI, false, { 0, 1, 2, 3, 8, 9, 10, 11 }) is false.
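For reference, a minimal sketch of what this permutation selects, assuming GCC on an x86 target with SSE2. It compares the __builtin_shuffle from the testcase with the _mm_movelh_ps intrinsic (which maps to movlhps). The function names and the same_v8hi / check_lo_lo helpers are made up for illustration and are not part of the report.

```c
#include <assert.h>
#include <string.h>
#include <emmintrin.h> /* SSE2 intrinsics; also provides _mm_movelh_ps */

typedef unsigned short v8hi __attribute__((vector_size(16)));

/* The permutation from the report: low half of a, then low half of b. */
static v8hi shuffle_lo_lo (v8hi a, v8hi b)
{
  return __builtin_shuffle (a, b, (v8hi) { 0, 1, 2, 3, 8, 9, 10, 11 });
}

/* The same selection via the movlhps intrinsic.  The 16 bytes are only
   reinterpreted as four floats; no value conversion takes place.  */
static v8hi movlhps_lo_lo (v8hi a, v8hi b)
{
  __m128 fa, fb, fr;
  v8hi r;
  memcpy (&fa, &a, sizeof fa);
  memcpy (&fb, &b, sizeof fb);
  fr = _mm_movelh_ps (fa, fb); /* low 64 bits of fa, then low 64 bits of fb */
  memcpy (&r, &fr, sizeof r);
  return r;
}

/* Hypothetical helper: byte-wise equality of two vectors.  */
static int same_v8hi (v8hi x, v8hi y)
{
  return memcmp (&x, &y, sizeof x) == 0;
}

/* Hypothetical self-check; call it from main.  */
static void check_lo_lo (void)
{
  v8hi a = { 0, 1, 2, 3, 4, 5, 6, 7 };
  v8hi b = { 10, 11, 12, 13, 14, 15, 16, 17 };
  v8hi want = { 0, 1, 2, 3, 10, 11, 12, 13 };
  assert (same_v8hi (shuffle_lo_lo (a, b), want));
  assert (same_v8hi (movlhps_lo_lo (a, b), want));
}
```

Calling check_lo_lo () from main should pass silently if both forms agree. This only checks the result, not which instructions the compiler picks; for that, look at the output of -S.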
Two-instruction permutes are also possible with movhlps/movlhps;
for example, { 0, 1, 2, 3, 12, 13, 14, 15 } can use
  movhlps %xmm1, %xmm1
  movlhps %xmm1, %xmm0
Ah, that one already uses shufpd. Not sure why the first permutation above
doesn't use shufpd as well, given that it is available in SSE2.
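To illustrate the shufpd point: both permutations in this report select whole 64-bit halves, so each can be expressed as one shufpd with a different immediate. The sketch below assumes GCC on x86 with SSE2 and uses the _mm_shuffle_pd intrinsic; the function names and the equal_v8hi / check_shufpd helpers are made up for illustration.

```c
#include <assert.h>
#include <string.h>
#include <emmintrin.h> /* SSE2: _mm_shuffle_pd */

typedef unsigned short v8hi __attribute__((vector_size(16)));

/* shufpd semantics: bit 0 of the immediate selects the low or high half
   of the first operand for the low result half; bit 1 selects the low or
   high half of the second operand for the high result half.  */

/* Immediate 0: low half of a, low half of b, i.e. { 0..3, 8..11 }.  */
static v8hi shufpd_lo_lo (v8hi a, v8hi b)
{
  __m128d da, db, dr;
  v8hi r;
  memcpy (&da, &a, sizeof da);
  memcpy (&db, &b, sizeof db);
  dr = _mm_shuffle_pd (da, db, 0);
  memcpy (&r, &dr, sizeof r);
  return r;
}

/* Immediate 2: low half of a, high half of b, i.e. { 0..3, 12..15 }.  */
static v8hi shufpd_lo_hi (v8hi a, v8hi b)
{
  __m128d da, db, dr;
  v8hi r;
  memcpy (&da, &a, sizeof da);
  memcpy (&db, &b, sizeof db);
  dr = _mm_shuffle_pd (da, db, 2);
  memcpy (&r, &dr, sizeof r);
  return r;
}

/* Hypothetical helper: byte-wise equality of two vectors.  */
static int equal_v8hi (v8hi x, v8hi y)
{
  return memcmp (&x, &y, sizeof x) == 0;
}

/* Hypothetical self-check against the generic shuffles; call it from main.  */
static void check_shufpd (void)
{
  v8hi a = { 0, 1, 2, 3, 4, 5, 6, 7 };
  v8hi b = { 10, 11, 12, 13, 14, 15, 16, 17 };
  v8hi lo_lo = __builtin_shuffle (a, b, (v8hi) { 0, 1, 2, 3, 8, 9, 10, 11 });
  v8hi lo_hi = __builtin_shuffle (a, b, (v8hi) { 0, 1, 2, 3, 12, 13, 14, 15 });
  assert (equal_v8hi (shufpd_lo_lo (a, b), lo_lo));
  assert (equal_v8hi (shufpd_lo_hi (a, b), lo_hi));
}
```

The bits are only reinterpreted as doubles here, so this says nothing about any domain-crossing cost on real hardware; it only shows that the two masks differ by the shufpd immediate.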