Bug 8878 - [3.3 only] miscompilation with -O and SSE
Status: RESOLVED FIXED
Product: gcc
Classification: Unclassified
Component: rtl-optimization
Version: 3.2.1
Importance: P3 normal
Target Milestone: 3.3.1
Assigned To: Aldy Hernandez
Keywords: monitored, wrong-code
Depends on:
Blocks:
Reported: 2002-12-09 09:56 UTC by kronoz
Modified: 2003-07-24 19:13 UTC (History)

See Also:
Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2003-06-30 22:26:12


Description kronoz 2002-12-09 09:56:04 UTC
gcc produces wrong assembly when vector instructions are used through built-in
functions together with -O. I was benchmarking pure FPU vs. SSE, so the program
is a loop that multiplies and adds two vectors.

Release:
3.2.1

Environment:
System: Linux dreamland 2.4.20-rc3-xfs-acpi #9 Wed Nov 27 18:01:32 CET 2002 i686 unknown
Architecture: i686

	
host: i686-pc-linux-gnu
build: i686-pc-linux-gnu
target: i686-pc-linux-gnu
configured with: ../gcc/configure --prefix=/usr --enable-shared --with-slibdir=/lib --with-gnu-as --with-gnu-ld --enable-threads --enable-languages=c,c++

How-To-Repeat:
#include <stdio.h>

int main(void) {
	typedef int v4sf __attribute__ ((mode(V4SF)));
	v4sf a = {2.0, 2.0, 2.0, 2.0}; 
	v4sf b = {1.0, 2.0, 3.0, 4.0};
	v4sf c = {0.0, 0.0, 0.0, 0.0};
	v4sf d;
	int i;
	
	for(i = 0; i < 1000000; i++) {
		d = __builtin_ia32_mulps(a, b);
		c = __builtin_ia32_addps(c, d);
	}
	
	for (i = 0; i < 4; i++)
		printf("%f ", *(((float *)(&c)) + i));
	printf("\n");
	
	return 0;
}

Compiled with "gcc -Wall -ofloat-sse -march=athlon-xp float-sse.c", it
gives the following (correct) output:

kronos:~/c$ float-sse
2000000.000000 4000000.000000 6000000.000000 8000000.000000

Compiled with "gcc -O -Wall -ofloat-sse -march=athlon-xp float-sse.c", it
gives wrong output:

kronos:~/c$ float-sse
8000000.000000 0.000000 0.000000 0.000000

Assembly output available if needed.
Comment 1 kronoz 2002-12-09 09:56:04 UTC
Fix:
Don't know.
Comment 2 Volker Reichelt 2002-12-11 06:08:13 UTC
State-Changed-From-To: open->analyzed
State-Changed-Why: Confirmed.
    
    Here's a cleaned-up testcase (which does not suffer from the
    aliasing problems of the original one, where &c is cast):
    
    ----------------------------snip here-------------------------
    #include <stdio.h>
    
    typedef int v4sf __attribute__((mode(V4SF)));
    
    int main(void)
    {
        v4sf v = {1.0, 2.0, 3.0, 4.0};
        union { v4sf v; float f[4]; } u;
    
        u.v = __builtin_ia32_mulps(v,v);
    
        printf("%f %f %f %f\n", u.f[0], u.f[1], u.f[2], u.f[3]);
    
        return 0;
    }
    ----------------------------snip here-------------------------
    
    Compiling this with "gcc -O -msse" on an i686-pc-linux-gnu machine
    results in an executable that prints
    
       16.000000 0.000000 0.000000 0.000000
    
    instead of
    
       1.000000 4.000000 9.000000 16.000000
    
    as expected. Replacing "mulps" by "addps" will generate equally wrong results.
Comment 3 janis187 2003-04-02 11:44:27 UTC
From: Janis Johnson <janis187@us.ibm.com>
To: Volker Reichelt <reichelt@igpm.rwth-aachen.de>
Cc: janis187@us.ibm.com, gcc-gnats@gcc.gnu.org, gcc-bugs@gcc.gnu.org,
   kronoz@tiscali.it
Subject: Re: optimization/8878: miscompilation with -O and SSE
Date: Wed, 2 Apr 2003 11:44:27 -0800

 On Wed, Apr 02, 2003 at 09:17:49PM +0200, Volker Reichelt wrote:
 > Hi Janis,
 > 
 > PR 8878 got fixed on mainline in the last couple of days (somewhere
 > between 2003-03-29 and 2003-04-02).
 > Since the bug is in the category "wrong-code" it would be nice
 > if the patch could be backported to 3.3 (and maybe even 3.2) to
 > prevent silent miscompilations.
 > 
 > Could you please identify the patch that fixed the problem?
 > Maybe the following testcase is more convenient for the hunt,
 > since you can check the return value instead of the output:
 > 
 > ------------------------snip here----------------------------
 > typedef int v4sf __attribute__((mode(V4SF)));
 > 
 > int main(void)
 > {
 >     v4sf v = {1.0, 2.0, 3.0, 4.0};
 >     union { v4sf v; float f[4]; } u;
 > 
 >     u.v = __builtin_ia32_mulps(v,v);
 > 
 >     return u.f[0];
 > }
 > ------------------------snip here----------------------------
 > 
 > Just compile with "gcc -march=i686 -msse -O".
 
 I don't know much about x86 architectures and what's
 supposed to work where, but on my Pentium III this gets
 "Illegal instruction" when run, using a mainline compiler
 from sources updated yesterday.
 
 Janis

Comment 4 Volker Reichelt 2003-04-02 21:17:49 UTC
From: Volker Reichelt <reichelt@igpm.rwth-aachen.de>
To: janis187@us.ibm.com
Cc: gcc-gnats@gcc.gnu.org, gcc-bugs@gcc.gnu.org, kronoz@tiscali.it
Subject: Re: optimization/8878: miscompilation with -O and SSE
Date: Wed, 02 Apr 2003 21:17:49 +0200 (CEST)

 Hi Janis,
 
 PR 8878 got fixed on mainline in the last couple of days (somewhere
 between 2003-03-29 and 2003-04-02).
 Since the bug is in the category "wrong-code" it would be nice
 if the patch could be backported to 3.3 (and maybe even 3.2) to
 prevent silent miscompilations.
 
 Could you please identify the patch that fixed the problem?
 Maybe the following testcase is more convenient for the hunt,
 since you can check the return value instead of the output:
 
 ------------------------snip here----------------------------
 typedef int v4sf __attribute__((mode(V4SF)));
 
 int main(void)
 {
     v4sf v = {1.0, 2.0, 3.0, 4.0};
     union { v4sf v; float f[4]; } u;
 
     u.v = __builtin_ia32_mulps(v,v);
 
     return u.f[0];
 }
 ------------------------snip here----------------------------
 
 Just compile with "gcc -march=i686 -msse -O".
 
 Thanks,
 Volker
 
 http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=8878
 
 

Comment 5 janis187 2003-04-03 11:36:47 UTC
From: Janis Johnson <janis187@us.ibm.com>
To: gcc-gnats@gcc.gnu.org, gcc-bugs@gcc.gnu.org, nobody@gcc.gnu.org,
   gcc-prs@gcc.gnu.org, kronoz@tiscali.it, aldyh@redhat.com
Cc:  
Subject: Re: optimization/8878: miscompilation with -O and SSE
Date: Thu, 03 Apr 2003 11:36:47 -0800

 This was fixed on mainline by the following patch, whose
 date is actually 2003-04-01:
 
 2003-02-31  Aldy Hernandez  <aldyh@redhat.com>
 
         * testsuite/gcc.c-torture/execute/simd-3.c: New.
 
         * expr.c (expand_expr): Handle VECTOR_CST.
           (const_vector_from_tree): New.
 
         * varasm.c (output_constant): Handle VECTOR_CST.
 
         * c-typeck.c (digest_init): Build a vector constant from a
           VECTOR_TYPE.
 
         [lots of changes to files in config/rs6000]
 
 My hunt used the testcase that Volker sent me yesterday
 and checked that the generated .s file did not include
 'mulps'.
 
 http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=8878
 
 
 
Comment 6 Volker Reichelt 2003-05-02 16:34:14 UTC
Responsible-Changed-From-To: unassigned->aldyh
Responsible-Changed-Why: This problem is probably in your domain.
    
    Can you fix that on the 3.3 branch or is this pure 3.4 stuff
    (where it is fixed already)?
    
    BTW, I now get an ICE on the 3.3 branch (as of 20030502):
    PR8878.c: In function `main':
    PR8878.c:18: internal compiler error: in subreg_hard_regno, at emit-rtl.c:931
    Please submit a full bug report, [etc.]
Comment 7 Dara Hazeghi 2003-07-06 02:00:47 UTC
Note that the ICE on the 3.3 branch is now gone, but the code is still wrong.
Comment 8 Volker Reichelt 2003-07-09 13:41:20 UTC
I still get the ICE on the branch (20030709).
Maybe that's because I'm using --enable-checking.
Comment 9 Aldy Hernandez 2003-07-24 19:13:28 UTC
fixed by:

http://gcc.gnu.org/ml/gcc-patches/2003-07/msg02401.html