[Bug target/59539] New: Missed optimisation: VEX-prefixed operations don't need aligned data
thiago at kde dot org
gcc-bugzilla@gcc.gnu.org
Wed Dec 18 00:50:00 GMT 2013
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59539
Bug ID: 59539
Summary: Missed optimisation: VEX-prefixed operations don't need aligned data
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: thiago at kde dot org
Consider the following code:
#include <immintrin.h>

int f(void *p1, void *p2)
{
    __m128i d1 = _mm_loadu_si128((__m128i*)p1);
    __m128i d2 = _mm_loadu_si128((__m128i*)p2);
    __m128i result = _mm_cmpeq_epi16(d1, d2);
    return _mm_movemask_epi8(result);
}
If compiled with -O2 -mavx, it produces the following code with GCC 4.9
(current trunk):
f:
        vmovdqu (%rdi), %xmm0
        vmovdqu (%rsi), %xmm1
        vpcmpeqw %xmm1, %xmm0, %xmm0
        vpmovmskb %xmm0, %eax
        ret
One of the two VMOVDQU instructions is unnecessary, since the VEX-prefixed
VPCMPEQW instruction can take an unaligned memory operand without faulting.
The Intel Software Developer's Manual Volume 1, Chapter 14 says in 14.9
"Memory alignment":
> With the exception of explicitly aligned 16 or 32 byte SIMD load/store instructions, most VEX-encoded,
> arithmetic and data processing instructions operate in a flexible environment regarding memory address
> alignment, i.e. VEX-encoded instruction with 32-byte or 16-byte load semantics will support unaligned load
> operation by default. Memory arguments for most instructions with VEX prefix operate normally without
> causing #GP(0) on any byte-granularity alignment (unlike Legacy SSE instructions). The instructions that
> require explicit memory alignment requirements are listed in Table 14-22.
Clang and ICC have already implemented this optimisation:
Clang 3.3 produces:
f: # @f
        vmovdqu (%rsi), %xmm0
        vpcmpeqw (%rdi), %xmm0, %xmm0
        vpmovmskb %xmm0, %eax
        ret
Similarly, ICC 14 produces:
f:
        vmovdqu (%rdi), %xmm0
        vpcmpeqw (%rsi), %xmm0, %xmm1
        vpmovmskb %xmm1, %eax
        ret
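
For anyone who wants to check that a given code sequence still returns the
right result on misaligned input, a minimal harness along the following lines
might help. It is only a sketch: compare_at_offsets is a hypothetical helper
that is not part of the original testcase, and the aligned attribute is the
GCC/Clang spelling. It places identical data at chosen byte offsets inside
16-byte-aligned buffers and returns whatever f reports. It does not show
which instruction form the compiler picked; that still has to be read from
the -S output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <immintrin.h>

/* Same function as in the report. */
int f(void *p1, void *p2)
{
    __m128i d1 = _mm_loadu_si128((__m128i*)p1);
    __m128i d2 = _mm_loadu_si128((__m128i*)p2);
    __m128i result = _mm_cmpeq_epi16(d1, d2);
    return _mm_movemask_epi8(result);
}

/* Hypothetical helper (not in the original report): copy the same 16 bytes
   into two 16-byte-aligned buffers at the given byte offsets, so that any
   offset that is not a multiple of 16 yields a misaligned pointer, then
   compare them with f. */
int compare_at_offsets(size_t off1, size_t off2)
{
    static const unsigned char pattern[16] = {
        0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
    };
    unsigned char buf1[32] __attribute__((aligned(16)));
    unsigned char buf2[32] __attribute__((aligned(16)));

    /* Keep the 16-byte window inside the 32-byte buffers. */
    assert(off1 <= 16 && off2 <= 16);

    memcpy(buf1 + off1, pattern, sizeof pattern);
    memcpy(buf2 + off2, pattern, sizeof pattern);
    return f(buf1 + off1, buf2 + off2);
}
```

Since all eight 16-bit lanes compare equal, compare_at_offsets(1, 3) should
return 0xFFFF whether or not the compiler folds one of the loads into the
VPCMPEQW memory operand.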