This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug middle-end/40553] New: wrong result(nan) using vector extensions on athlon-xp
- From: "CaptainSifff at gmx dot de" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 25 Jun 2009 18:48:51 -0000
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
I'm using the code from http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40550
and added output of the result vector:
#include <cstdio>
typedef float v2sf __attribute__ ((vector_size (2 * sizeof(float))));
int main()
{
    v2sf a = {1.0, 0.0};
    v2sf b = {0.0, 1.0};
    v2sf d;
    d = a + b;
    float* dp = (float*) &d;
    printf("%f %f \n", dp[0], dp[1]);
    return 0;
}
The vector d contains (nan, nan) when compiled with g++ -march=athlon-xp, as
opposed to (1, 1) when compiled without flags. Looking through the generated
assembly, I found that the compiler uses the %mm registers. So for good measure
I added a call to __builtin_ia32_femms() after the addition to clear the
multimedia state:
#include <cstdio>
typedef float v2sf __attribute__ ((vector_size (2 * sizeof(float))));
int main()
{
    v2sf a = {1.0, 0.0};
    v2sf b = {0.0, 1.0};
    v2sf d;
    d = a + b;
    float* dp = (float*) &d;
    __builtin_ia32_femms();
    printf("%f %f \n", dp[0], dp[1]);
    return 0;
}
Et voilà, the program gives the correct answer. But as gcc is also rummaging
around in the SSE registers and seems to do the actual addition on the x87 FPU,
this might not be the real fix. Note that, as in the other bug, a double
version of this code works fine. Note also that optimized builds of this code
work too, but this seems to be because the addition is optimized away.
--
Summary: wrong result(nan) using vector extensions on athlon-xp
Product: gcc
Version: 4.3.3
Status: UNCONFIRMED
Severity: minor
Priority: P3
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: CaptainSifff at gmx dot de
GCC build triplet: i686-linux-gnu
GCC host triplet: i686-linux-gnu
GCC target triplet: i686-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40553