This is the mail archive of the gcc-prs@gcc.gnu.org mailing list for the GCC project.
optimization/8878: miscompilation with -O and SSE
- From: Luca "Kronos" Tettamanti <kronoz at tiscali dot it>
- To: gcc-gnats at gcc dot gnu dot org
- Date: Mon, 9 Dec 2002 18:53:16 +0100 (CET)
- Subject: optimization/8878: miscompilation with -O and SSE
>Number: 8878
>Category: optimization
>Synopsis: miscompilation with -O and SSE
>Confidential: no
>Severity: serious
>Priority: low
>Responsible: unassigned
>State: open
>Class: wrong-code
>Submitter-Id: net
>Arrival-Date: Mon Dec 09 09:56:04 PST 2002
>Closed-Date:
>Last-Modified:
>Originator:
>Release: 3.2.1
>Organization:
>Environment:
System: Linux dreamland 2.4.20-rc3-xfs-acpi #9 Wed Nov 27 18:01:32 CET 2002 i686 unknown
Architecture: i686
host: i686-pc-linux-gnu
build: i686-pc-linux-gnu
target: i686-pc-linux-gnu
configured with: ../gcc/configure --prefix=/usr --enable-shared --with-slibdir=/lib --with-gnu-as --with-gnu-ld --enable-threads --enable-languages=c,c++
>Description:
GCC produces wrong assembly when vector instructions are used through
built-in functions and the program is compiled with -O. I was benchmarking
pure FPU code against SSE, so the program is a loop that multiplies two
vectors and adds the product to an accumulator vector.
>How-To-Repeat:
#include <stdio.h>

int main(void) {
    typedef int v4sf __attribute__ ((mode(V4SF)));
    v4sf a = {2.0, 2.0, 2.0, 2.0};
    v4sf b = {1.0, 2.0, 3.0, 4.0};
    v4sf c = {0.0, 0.0, 0.0, 0.0};
    v4sf d;
    int i;

    for(i = 0; i < 1000000; i++) {
        d = __builtin_ia32_mulps(a, b);
        c = __builtin_ia32_addps(c, d);
    }

    for (i = 0; i < 4; i++)
        printf("%f ", *(((float *)(&c)) + i));
    printf("\n");

    return 0;
}
When compiled without optimization:
    gcc -Wall -ofloat-sse -march=athlon-xp float-sse.c
the program prints the following (correct) output:
kronos:~/c$ float-sse
2000000.000000 4000000.000000 6000000.000000 8000000.000000
When compiled with -O:
    gcc -O -Wall -ofloat-sse -march=athlon-xp float-sse.c
the program prints wrong output:
kronos:~/c$ float-sse
8000000.000000 0.000000 0.000000 0.000000
Assembly output available if needed.
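For comparison, the expected values can be checked with a scalar version of
the same loop that uses no SSE built-ins. This is only a sketch and not part
of the original test case: the function name reference_sum and its
parameters are hypothetical, chosen here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar reference for the vector loop in the test case (hypothetical
   helper, not from the original report): each lane out[k] accumulates
   a[k] * b[k] once per iteration, using plain float arithmetic. */
void reference_sum(float out[4], long iterations)
{
    const float a[4] = {2.0f, 2.0f, 2.0f, 2.0f};
    const float b[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    size_t k;
    long i;

    assert(iterations >= 0);

    /* Start from the zero vector, like c in the test case. */
    for (k = 0; k < 4; k++)
        out[k] = 0.0f;

    /* Same work as mulps followed by addps, one lane at a time. */
    for (i = 0; i < iterations; i++)
        for (k = 0; k < 4; k++)
            out[k] += a[k] * b[k];
}
```

Calling reference_sum(r, 1000000L) and printing r[0] through r[3] with "%f "
should reproduce the line from the unoptimized build: every partial sum is an
integer below 2^24, so it is exactly representable in single precision.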
>Fix:
Don't know.
>Release-Note:
>Audit-Trail:
>Unformatted: