GCC Bugzilla – Bug 19391
[4.0 Regression] missed optimization with size of 8 vectors
Last modified: 2005-07-23 22:49:57 UTC
We do not generate vector instructions for the following code on x86_64 (and on x86 with -msse3):
typedef short mmxw __attribute__ ((vector_size(8)));
typedef int mmxdw __attribute__ ((vector_size(8)));
The code comes from PR 14552, but we no longer use the vector unit for the addition, so we
produce a lot of crappy code:
movq w(%rip), %xmm0
movabsq $-9223231297218904064, %rax
movq %xmm0, -8(%rsp)
movq -8(%rsp), %rsi
movq %rsi, %rcx
movq %rsi, %rdx
xorq %rsi, %rcx
andq %rax, %rcx
movabsq $9223231297218904063, %rax
andq %rax, %rdx
addq %rdx, %rdx
xorq %rdx, %rcx
movq %rcx, -16(%rsp)
movq -16(%rsp), %xmm0
movq %xmm0, w(%rip)
movq %xmm0, dw(%rip)
Compared to what we got in 3.4:
movq w, %mm1
psllw $1, %mm1
movq %mm1, w
movq w, %mm0
movq %mm0, dw
Note this worked with 20041124.
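For readers decoding the 4.0 output: the two movabsq constants are 0x8000800080008000 and 0x7fff7fff7fff7fff, so the scalar sequence looks like a generic "add four 16-bit lanes inside one 64-bit integer" lowering applied to w + w. The sketch below is only my reading of the assembly, not code taken from GCC; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Masks as they appear (in signed decimal) in the 4.0 assembly. */
#define LANE_HIGH UINT64_C(0x8000800080008000) /* -9223231297218904064 */
#define LANE_LOW  UINT64_C(0x7fff7fff7fff7fff) /*  9223231297218904063 */

/* Hypothetical helper: add four packed 16-bit lanes using only
   64-bit integer operations, mirroring the and/add/xor sequence. */
static uint64_t swar_add16x4(uint64_t a, uint64_t b)
{
    /* Add the low 15 bits of every lane; a carry can reach bit 15
       of its own lane but never crosses into the next lane. */
    uint64_t low = (a & LANE_LOW) + (b & LANE_LOW);
    /* Bit 15 of each lane is a15 ^ b15 ^ carry-in. */
    uint64_t high = (a ^ b) & LANE_HIGH;
    return low ^ high;
}

/* Reference result computed one lane at a time. */
static uint64_t lanes_add16x4(uint64_t a, uint64_t b)
{
    uint16_t x[4], y[4];
    uint64_t r;
    memcpy(x, &a, sizeof x);
    memcpy(y, &b, sizeof y);
    for (int i = 0; i < 4; i++)
        x[i] = (uint16_t)(x[i] + y[i]);
    memcpy(&r, x, sizeof r);
    return r;
}

/* Checks the w + w case from this report on one sample value. */
static void swar_self_check(void)
{
    uint64_t w = UINT64_C(0x7fff800100020003);
    assert(swar_add16x4(w, w) == lanes_add16x4(w, w));
}
```

With a == b the (a ^ b) term is always zero, which matches the xorq %rsi, %rcx of a register with a copy of itself in the listing. The whole sequence, plus the round trips through the stack, replaces the single psllw $1 that 3.4 emitted.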
Is this a regression that may be caused by RTH's rewrite?
Try again with <mmintrin.h> functions.
It is absolutely ESSENTIAL that we do NOT emit mmx vector operations UNLESS
the <mmintrin.h> routines are used. Doing so without the compiler also
being able to insert emms instructions is wrong-code.
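A minimal sketch of what the <mmintrin.h> route might look like for the doubling in this report, assuming an x86_64 target (the 64-bit conversion intrinsics are not available on 32-bit x86). The function names are hypothetical; the point is that the user, not the compiler, issues the emms via _mm_empty().

```c
#include <assert.h>
#include <mmintrin.h>
#include <stdint.h>

/* Hypothetical helper: double each of the four 16-bit lanes of w
   using explicit MMX intrinsics. Assumes x86_64. */
static uint64_t double_lanes_mmx(uint64_t w)
{
    __m64 v = _mm_cvtsi64_m64((long long)w); /* move the 64 bits into an MMX value */
    v = _mm_add_pi16(v, v);                  /* paddw: lane-wise 16-bit add */
    uint64_t r = (uint64_t)_mm_cvtm64_si64(v);
    _mm_empty();                             /* emms: make the x87 state usable again */
    return r;
}

/* Checks one sample value, including a lane that wraps around. */
static void mmx_self_check(void)
{
    assert(double_lanes_mmx(UINT64_C(0x7fff800100020003))
           == UINT64_C(0xfffe000200040006));
}
```

Because the MMX registers alias the x87 stack, any floating-point code that runs after MMX instructions without an intervening emms can misbehave, which is why emitting MMX operations behind the user's back is treated as wrong-code here.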
(In reply to comment #3)
> Try again with <mmintrin.h> functions.
> It is absolutely ESSENTIAL that we do NOT emit mmx vector operations UNLESS
> the <mmintrin.h> routines are used. Doing so without the compiler also
> being able to insert emms instructions is wrong-code.
OK, but please add something to the changes page. I thought I had seen this wording in a patch,
but I am not sure that I did.