[Bug c/21239] New: Illegal elimination of SSE2 load/store using xmm intrinsics
kurt at garloff dot de
gcc-bugzilla@gcc.gnu.org
Tue Apr 26 23:46:00 GMT 2005
/** intrin.c
 *
 * gcc-4.0 misoptimizes the _mm_load_sd() away with
 * -O1 (x86-64), with or without -m32 -msse2.
 *
 * (c) Kurt Garloff <garloff@suse.de>, Artistic v2
 */
#include <stdlib.h>
#include <emmintrin.h>
#ifdef WORKAROUND
# define ACCESS(X) asm("": : "x"(X))
#else
# define ACCESS(X)
#endif
void do_copy(const unsigned int ln, double* const dst,
             const double* const src)
{
        int i = ln;
        const register double *s = src;
        register double *d = dst;
        __m128d TMP;
        while (i) {
                TMP = _mm_load_sd(s);
                ACCESS(TMP);
                _mm_store_sd(d, TMP);
                --i; ++s; ++d;
        }
}
int main()
{
        unsigned int i;
        double *a, *b, *c;
        a = (double*) malloc(19*sizeof(double));
        b = (double*) malloc(19*sizeof(double));
        for (i = 0; i < 19; ++i) {
                a[i] = 1; b[i] = 2;
        }
        do_copy(19, a, b);
        return (a[18] != 2);
}
The test program should return 0, which it does when built with gcc-3.3/3.4 or
when compiled with -DWORKAROUND. gcc-4.0, 4_0-branch, HEAD, and
tree-profiling-branch all fail: the _mm_load_sd() is optimized away.
I guess the compiler does not consider the _mm_store_sd() a consumer of
the vector register. Adding a fake consumer, asm("" : : "x"(XMMREG)), works
around the problem.
Compiling with -m32 -msse2 exposes the same problem, so I strongly suspect
that a native x86 compiler is affected as well.
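The fake-consumer idea can be sketched as a standalone helper together with a stricter check than the single a[18] comparison. This is only an illustration under assumptions: the names copy_sd and count_mismatches are hypothetical and not part of the test case, and the __SSE2__ guard with its scalar fallback is added here so the sketch also builds on targets without SSE2.

```c
#include <stddef.h>
#if defined(__SSE2__)
# include <emmintrin.h>
#endif

/* Hypothetical helper: copy n doubles one at a time through an XMM register,
 * with the same empty-asm "fake consumer" as the WORKAROUND macro. */
static void copy_sd(size_t n, double *dst, const double *src)
{
    size_t i;
    for (i = 0; i < n; ++i) {
#if defined(__SSE2__)
        __m128d tmp = _mm_load_sd(src + i);
        /* Empty asm with an "x" (SSE register) input operand: it emits no
         * instructions but uses tmp, so the loaded value has a consumer. */
        __asm__ __volatile__("" : : "x"(tmp));
        _mm_store_sd(dst + i, tmp);
#else
        /* Assumed fallback for non-SSE2 targets: plain scalar copy. */
        dst[i] = src[i];
#endif
    }
}

/* Hypothetical checker: number of elements where dst differs from src. */
static size_t count_mismatches(size_t n, const double *dst, const double *src)
{
    size_t i, bad = 0;
    for (i = 0; i < n; ++i)
        if (dst[i] != src[i])
            ++bad;
    return bad;
}
```

Calling copy_sd(19, a, b) on the arrays from main() and then count_mismatches(19, a, b) should give 0 on a correctly working compiler; any nonzero count points at a dropped load or store somewhere in the loop, not only at the last element.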
Here's the wrong assembly produced by gcc-4.0 (on x86-64, using -O2):
do_copy:
.LFB495:
testl %edi, %edi
jne .L8
rep ; ret
.p2align 4,,7
.L8:
xorl %eax, %eax
.p2align 4,,7
.L4:
incl %eax
movq $0, (%rsi)
addq $8, %rsi
cmpl %eax, %edi
jne .L4
rep ; ret
... and here the correct assembly with -DWORKAROUND added:
do_copy:
.LFB495:
testl %edi, %edi
jne .L8
rep ; ret
.p2align 4,,7
.L8:
xorl %eax, %eax
.p2align 4,,7
.L4:
movsd (%rdx), %xmm0
incl %eax
movlpd %xmm0, (%rsi)
addq $8, %rdx
addq $8, %rsi
cmpl %eax, %edi
jne .L4
rep ; ret
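Read back into C, the two listings differ only in the loop body. The sketch below is a hand translation of the assembly above, not compiler output; the function names are hypothetical, and it assumes the System V x86-64 argument registers (ln in %edi, dst in %rsi, src in %rdx).

```c
/* Hand translation of the wrong listing: src (%rdx) is never read, and
 * "movq $0, (%rsi)" stores an all-zero 8-byte pattern, i.e. +0.0, to dst. */
static void do_copy_as_miscompiled(unsigned int ln, double *dst,
                                   const double *src)
{
    unsigned int i;
    (void)src;                  /* the load from src was eliminated */
    for (i = 0; i != ln; ++i)
        dst[i] = 0.0;
}

/* Hand translation of the -DWORKAROUND listing: movsd loads from src,
 * movlpd stores the low double to dst, and both pointers advance by 8. */
static void do_copy_as_intended(unsigned int ln, double *dst,
                                const double *src)
{
    unsigned int i;
    for (i = 0; i != ln; ++i)
        dst[i] = src[i];
}
```

With the arrays from main(), the first variant leaves a[18] at 0 rather than 2, so the test program returns 1; the second leaves a[18] at 2 and the program returns 0.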
--
Summary: Illegal elimination of SSE2 load/store using xmm intrinsics
Product: gcc
Version: 4.0.0
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: kurt at garloff dot de
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: x86_64-suse-linux
GCC host triplet: x86_64-suse-linux
GCC target triplet: x86_64-suse-linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21239