Bug 14981 - [3.4 Regression] ICE in _mm_xor_pd for SSE2 with -O1
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 3.4.0
Importance: P2 normal
Target Milestone: 3.4.4
Assignee: Richard Henderson
URL:
Keywords: ice-on-valid-code, monitored, patch
Duplicates: 14982 15010 18779 20051
Depends on:
Blocks:
 
Reported: 2004-04-16 18:38 UTC by davide rossetti
Modified: 2005-05-09 22:55 UTC
CC List: 13 users

See Also:
Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu
Known to work: 4.0.0 3.4.4
Known to fail:
Last reconfirmed: 2004-10-08 18:48:27


Attachments
proposed patch (299 bytes, patch)
2004-10-16 00:01 UTC, Richard Henderson
patch I'll be testing tomorrow (755 bytes, patch)
2005-05-09 22:55 UTC, Jakub Jelinek

Description davide rossetti 2004-04-16 18:38:51 UTC
tried on snapshot 3.4-20040414 but exists since 3.4-20040218
3.3.2 20040108 (Red Hat Linux 3.3.2-6) is OK
CVS HEAD from 3.5 is OK

ICE is:
gcc_snapshot/build/bin/gcc -v -save-temps -O1 -S -msse2 -mfpmath=sse v2dfxor3.c
...
 /beatle/home1/sw/gcc_snapshot/build/libexec/gcc/i686-pc-linux-gnu/3.4.0/cc1
-fpreprocessed v2dfxor3.i -quiet -dumpbase v2dfxor3.c -msse2 -mfpmath=sse
-mtune=pentiumpro -auxbase v2dfxor3 -O1 -version -o v2dfxor3.s
GNU C version 3.4.0 20040414 (prerelease) (i686-pc-linux-gnu)
        compiled by GNU C version 3.3.2 20040108 (Red Hat Linux 3.3.2-6).
GGC heuristics: --param ggc-min-expand=64 --param ggc-min-heapsize=64266
v2dfxor3.c: In function `xorv2df3':
v2dfxor3.c:10: internal compiler error: in immed_double_const, at emit-rtl.c:481
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.

It is caused by immed_double_const() mistakenly being called by
simplify_immed_subreg(). The latter is not ready to take TImode in outer_class ...


the test is not worth an attachment:

#include <emmintrin.h>
__v2df res;
void xorv2df3(double *x)
{
        __v2df temp0 = {x[0], x[1]};
        __v2df temp1 = {0x0, 0x0};
        res = _mm_xor_pd(temp0, temp1);
}
Comment 1 Wolfgang Bangerth 2004-04-16 18:51:39 UTC
*** Bug 14982 has been marked as a duplicate of this bug. ***
Comment 2 Wolfgang Bangerth 2004-04-16 18:54:01 UTC
Confirmed. ICEs in 3.3 and 3.4, but not in mainline. 3.2 didn't have 
the respective header file. 
 
W. 
Comment 3 Andrew Pinski 2004-04-16 19:15:40 UTC
Jan, I think one of your vector patches for the mainline fixed this. Would you mind looking into which one
and backporting it for 3.4.1? Otherwise we can just close this as fixed for 3.5.0.
Comment 4 Andrew Pinski 2004-04-19 14:19:00 UTC
*** Bug 15010 has been marked as a duplicate of this bug. ***
Comment 5 Giovanni Bajo 2004-04-20 01:45:30 UTC
Given Wolfgang's comment, this appears to be a regression on the 3.3 branch as
well (works in 3.3.2, fails in 3.3.3). We might want to backport the patch 
there as well.
Comment 6 Andrew Pinski 2004-04-20 02:08:34 UTC
Hmm, I can reproduce it on a very late snapshot (meaning from right before the branch of 3.4.0), but I cannot
reproduce it on a plain 3.3.3 or the mainline.
Here is output of gcc -v for 3.3.3:
Reading specs from /home/gates/pinskia/ia32_linux_gcc3_3/lib/gcc-lib/i686-pc-linux-gnu/3.3.3/specs
Configured with: ../configure --prefix=/home/gates/pinskia/ia32_linux_gcc3_3
Thread model: posix
gcc version 3.3.3
 /home/gates/pinskia/ia32_linux_gcc3_3/lib/gcc-lib/i686-pc-linux-gnu/3.3.3/cc1 -quiet -v 
-D__GNUC__=3 -D__GNUC_MINOR__=3 -D__GNUC_PATCHLEVEL__=3 pr14981.c -quiet -dumpbase 
pr14981.c -msse2 -mfpmath=sse -auxbase pr14981 -O1 -version -o /tmp/ccydmdaN.s
GNU C version 3.3.3 (i686-pc-linux-gnu)
        compiled by GNU C version 3.3.3.
Comment 7 Wolfgang Bangerth 2004-04-20 04:25:21 UTC
I have the following versions on my system: 
- SuSE gcc 3.3.1, which doesn't have emmintrin.h 
- gcc 3.3.4 pre 20040402, which ICEs (see below) 
- gcc 3.4 pre 20040402, which does as well 
- gcc 3.5 pre 20040402 which is OK. 
Note that the ICEs in 3.3.4 and 3.4 happen in different places. 
W. 
 
 
g/x> /home/bangerth/bin/gcc-3.3.4-pre/bin/gcc -O1 -S -msse2 -mfpmath=sse -c 
x.c 
x.c: In function `xorv2df3': 
x.c:8: internal compiler error: in subreg_hard_regno, at emit-rtl.c:932 
Please submit a full bug report, 
with preprocessed source if appropriate. 
See <URL:http://gcc.gnu.org/bugs.html> for instructions. 
 
g/x> /home/bangerth/bin/gcc-3.4-pre/bin/gcc -O1 -S -msse2 -mfpmath=sse -c x.c 
x.c: In function `xorv2df3': 
x.c:8: internal compiler error: in immed_double_const, at emit-rtl.c:481 
Please submit a full bug report, 
with preprocessed source if appropriate. 
See <URL:http://gcc.gnu.org/bugs.html> for instructions. 
 
 
Comment 8 Andrew Pinski 2004-04-20 04:33:34 UTC
Ok, this makes better sense now, so this is a regression on 3.3 branch from 3.3.3 and a regression on 
the 3.4.0 branch as well.  Thanks Wolfgang for clarification.
Comment 9 Giovanni Bajo 2004-06-06 03:53:19 UTC
Retargeting to 3.4.1, being a regression on that release branch.
Comment 10 Mark Mitchell 2004-06-21 21:16:47 UTC
Postponed until GCC 3.4.2.
Comment 11 Volker Reichelt 2004-08-17 02:51:05 UTC
Here's a more detailed description of what's happening:
The testcase can be reduced to

==========================================================
typedef int v2df __attribute__ ((mode (V2DF)));

v2df foo(double d)
{
    v2df x={0.0,d};
    v2df y={0.0,0.0};
    return __builtin_ia32_xorpd(x,y);
}
==========================================================

which crashes the compiler when compiled with "gcc -O -msse2".
The crash on the 3.3 branch (3.3 - 3.3.5) is in subreg_hard_regno,
the crash in the 3.4 branch (3.4.0 - 3.4.2) is immed_double_const
(both in emit-rtl.c).

These are actually two distinct problems, one with non-constant parameters
and one with constant parameters, which can be shown with the following
testcases:

The first one triggers the 3.3.x failure:
==========================================================
typedef int v2df __attribute__ ((mode (V2DF)));

v2df foo(double d)
{
    v2df x={0.0,d};
    return __builtin_ia32_xorpd(x,x);
}
==========================================================
Since the mode V2DF was introduced in 3.3 this is not a regression,
but a fix on the 3.3 branch would be nice anyway IMHO.

The second one triggers the 3.4.x failure:
==========================================================
typedef int v2df __attribute__ ((mode (V2DF)));

void foo()
{
    v2df x={0.0,0.0};
    __builtin_ia32_xorpd(x,x);
}
==========================================================
Since this worked in 3.3 this is a 3.4 regression.


Btw, the header file emmintrin.h was introduced in 3.3.3, so the
original example couldn't be compiled in 3.3 - 3.3.2.
The 3.3.2 version mentioned in the original bug report seems to be
Red Hat's own version with additional patches.


Alas, that's not all. By specifying "-msse" instead of "-O -msse2",
the last testcase crashes the 3.4 branch and mainline in
extract_constrain_insn_cached, at recog.c:2000 (thus a 3.4/3.5 regression).
If I don't even use "-msse", the 3.3 branch fails in
ix86_function_arg_boundary, at config/i386/i386.c:2414.

Btw, I removed the known-to-work/fail entries since they differ
from testcase to testcase.
Comment 12 Mark Mitchell 2004-08-29 18:12:02 UTC
Postponed until GCC 3.4.3.
Comment 13 Mark Mitchell 2004-08-29 18:14:44 UTC
Postponed until GCC 3.4.3.
Comment 14 Richard Henderson 2004-10-16 00:01:56 UTC
Created attachment 7362
proposed patch

The bug here is that, somewhere, we lost the fact that we need to force
HOST_WIDE_INT to be 64-bits wide, so that 2*HWI can represent TImode
constants for SSE.  I could have sworn this was already enabled, but that's
demonstrably not true.

Yes, this will slow down the compiler, perhaps significantly, but any
other solution will require substantial modifications to representations.
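
The width argument above can be checked with a small sketch in C. It is illustrative only: `two_word_const`, `pair_can_hold` and `pack_v2df` are hypothetical names, not GCC internals, and the code assumes a 64-bit IEEE 754 `double`. It shows that two 32-bit host words cover only 64 bits, while two 64-bit words cover the 128 bits of a TImode (or V2DF) constant.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative stand-in for a constant stored as a (low, high) word pair.
   This is NOT GCC's actual constant representation, only a sketch. */
struct two_word_const {
    uint64_t low;
    uint64_t high;
};

/* Returns nonzero if two host words of hwi_bits each can hold a value
   that is mode_bits wide. */
static int pair_can_hold(unsigned hwi_bits, unsigned mode_bits)
{
    return 2u * hwi_bits >= mode_bits;
}

/* Pack the bit patterns of two doubles (one 128-bit vector value) into
   two 64-bit words. Assumes double is exactly 64 bits wide. */
static struct two_word_const pack_v2df(double lo, double hi)
{
    struct two_word_const c;
    assert(sizeof(double) == sizeof(uint64_t));
    memcpy(&c.low, &lo, sizeof c.low);
    memcpy(&c.high, &hi, sizeof c.high);
    return c;
}
```

With these definitions, `pair_can_hold(32, 128)` is false and `pair_can_hold(64, 128)` is true, which is the reason given above for forcing a 64-bit HOST_WIDE_INT.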
Comment 15 Andrew Pinski 2004-10-21 03:17:15 UTC
Actually it will not slow down the compiler that much on the mainline because of my fix for PR 13987.
Comment 16 Mark Mitchell 2004-11-01 00:45:36 UTC
Postponed until GCC 3.4.4.
Comment 17 uros 2004-11-30 15:59:47 UTC
gcc version 4.0.0 20041130 (experimental) does not ICE in any testcase, so this
is not a 4.0 regression.
Comment 18 Volker Reichelt 2004-12-01 00:42:56 UTC
> gcc version 4.0.0 20041130 (experimental) does not ICE in any testcase,
> so this is not a 4.0 regression.

Well, that's not quite true. The following testcase still crashes
on mainline when compiled with "gcc -msse" (see also second-to-last
paragraph in comment #11):

======================================================
typedef double v2df __attribute__((vector_size(16)));

void foo()
{
    v2df x={0.0,0.0};
    __builtin_ia32_xorpd(x,x);
}
======================================================
Comment 19 uros 2004-12-01 06:50:28 UTC
(In reply to comment #18)

> Well, that's not quite true. The following testcase still crashes
> on mainline when compiled with "gcc -msse"

Ah, thanks. I have checked mainline with -msse2, as the code you quote is
actually a sse2 code. With -msse, it crashes for all optimization levels, and
with -msse2, mainline does not ice for me.

The following code is an SSE version, which compiles OK with -msse and -msse2:

typedef float v4sf __attribute__ ((vector_size (16)));

void
foo ()
{
  v4sf x = { 0.0, 0.0, 0.0, 0.0 };
  __builtin_ia32_xorpd (x, x);
}
Comment 20 Volker Reichelt 2004-12-02 09:49:27 UTC
Richard's patch http://gcc.gnu.org/ml/gcc-cvs/2004-12/msg00026.html
fixed the problem on mainline, so that all the problems mentioned in
this PR seem to be fixed on mainline.

We still have the 3.4 regression, though: The testcase from comment #18
still ICEs with "gcc -O -msse2" or "gcc -O -msse".
Comment 21 Serge Belyshev 2004-12-02 10:30:45 UTC
*** Bug 18779 has been marked as a duplicate of this bug. ***
Comment 22 Volker Reichelt 2004-12-03 10:18:32 UTC
Richard backported his fix for PR15289 (mentioned in comment #20)
to the 3.4 branch. This also fixed the problem when compiling
the testcase in comment #18 with "-msse".

The problem compiling the testcase with "-O -msse2" remains on the 3.4
branch, though.
Comment 23 Volker Reichelt 2004-12-03 11:50:57 UTC
The regression was introduced by Geoff's patch
http://gcc.gnu.org/ml/gcc-cvs/2004-01/msg00195.html
(before the 3.4 branch)

This was fixed on mainline by Jan's patch
http://gcc.gnu.org/ml/gcc-cvs/2004-02/msg00944.html
(after the 3.4 branch)

Jan, do you think a backport is feasible?
Comment 24 Eric Botcazou 2004-12-05 09:30:50 UTC
The fix for PR rtl-opt/15289 has been reverted on the 3.4 branch.
Comment 25 Volker Reichelt 2004-12-13 12:00:45 UTC
Since Richard's patch 
http://gcc.gnu.org/ml/gcc-cvs/2004-12/msg00494.html 
the testcase in comment #18 fails again on mainline. 
 
Comment 26 Eric Botcazou 2004-12-14 08:52:56 UTC
A bug involving SSE intrinsics is not critical.
Comment 27 Richard Henderson 2004-12-15 04:09:04 UTC
We should not ice, indeed, but the comment #18 test case -- with -msse 
but not -msse2 -- is invalid.  We should have stopped earlier, saying
that v2df is not available with sse1 only.
Comment 28 Richard Henderson 2004-12-15 12:17:46 UTC
I lied.  Of course the test case is valid.  Even without any sse enabled, the
test case falls back on generic vectors, and the builtin becomes a normal
function call.  There is something screwy going on, but we've definitely left
the bounds of the original bug report, so I'd like to handle this separately.
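
As a rough illustration of the generic-vector fallback mentioned above, the following C sketch performs a bitwise XOR on two 16-byte double vectors using only the GCC/Clang `vector_size` extension, with no SSE intrinsics or builtins. The helper names `xor_v2df` and `v2df_elem` are made up for this example, and it is not meant to show what the compiler does internally.

```c
#include <assert.h>
#include <string.h>

/* Generic vector types (GCC/Clang extension); no SSE header is needed. */
typedef double v2df __attribute__((vector_size(16)));
typedef long long v2di __attribute__((vector_size(16)));

/* Bitwise XOR of two double vectors, done on their integer bit patterns. */
static v2df xor_v2df(v2df a, v2df b)
{
    v2di ia, ib, ir;
    v2df r;
    memcpy(&ia, &a, sizeof ia);   /* reinterpret bits, no value conversion */
    memcpy(&ib, &b, sizeof ib);
    ir = ia ^ ib;                 /* element-wise XOR on the integer lanes */
    memcpy(&r, &ir, sizeof r);
    return r;
}

/* Read lane i (0 or 1) of a double vector. */
static double v2df_elem(v2df v, int i)
{
    double lanes[2];
    assert(i >= 0 && i < 2);
    memcpy(lanes, &v, sizeof lanes);
    return lanes[i];
}
```

XORing a vector with an all-zero vector returns it unchanged, and XORing it with itself yields zeros, which matches the operations used in the testcases in this report.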

The *original* report does indeed work with current sources.
Comment 29 Richard Henderson 2004-12-15 12:24:35 UTC
Followup in PR 19010.
Comment 30 Uroš Bizjak 2005-03-21 06:21:02 UTC
*** Bug 20051 has been marked as a duplicate of this bug. ***
Comment 31 Uroš Bizjak 2005-03-21 13:40:37 UTC
The original testcase from description fails again with GNU C version 3.4.4
20050321 (prerelease) (i686-pc-linux-gnu). It compiles OK with mainline.

PR 19010 compiles OK with both compilers.

Reopened as 3.4 regression.
Comment 32 Uroš Bizjak 2005-03-22 08:42:52 UTC
3.4 patch, backported from mainline:
http://gcc.gnu.org/ml/gcc-patches/2005-03/msg02057.html
Comment 33 GCC Commits 2005-03-22 15:54:14 UTC
Subject: Bug 14981

CVSROOT:	/cvs/gcc
Module name:	gcc
Branch: 	gcc-3_4-branch
Changes by:	uros@gcc.gnu.org	2005-03-22 15:53:59

Modified files:
	gcc            : ChangeLog simplify-rtx.c 
	gcc/config/i386: i386.md 
	gcc/testsuite  : ChangeLog 
Added files:
	gcc/testsuite/gcc.dg: pr14981-1.c 

Log message:
	PR target/14981
	Backport from mainline
	2004-02-18  Jan Hubicka  <jh@suse.cz>
	* simplify-rtx.c (simplify_unary_operation): Deal with logicals on
	floats.
	(simplify_binary_operation): Deal with logicals on floats.
	
	* i386.md (SSE fabs splitters): Emit new patterns.
	(SSE cmov splitters): Likewise.
	(sse_andv4sf3, sse_nandv4sf3, sse_iorv4sf3, sse_xorv4sf3
	(sse_andv2df3, sse_nandv2df3, sse_iorv2df3, sse_xorv2df3): Do not use
	subregs.
	(sse_andsf3, sse_nandsf3, sse_xorsf3): Kill.
	(sse_anddf3, sse_nanddf3, sse_xordf3): Kill.
	
	testsuite:
	
	PR target/14981
	* gcc.dg/pr14981-1.c: New test.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=2.2326.2.824&r2=2.2326.2.825
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/simplify-rtx.c.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.172.4.4&r2=1.172.4.5
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/i386.md.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.502.2.16&r2=1.502.2.17
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.3389.2.375&r2=1.3389.2.376
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/gcc.dg/pr14981-1.c.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=NONE&r2=1.1.2.1

Comment 34 Uroš Bizjak 2005-03-22 16:00:32 UTC
Fixed also on 3.4 branch, finally.
Comment 35 Jakub Jelinek 2005-05-09 16:50:40 UTC
This patch creates regressions on i?86:
FAIL: gcc.dg/20020523-1.c (test for excess errors)
WARNING: gcc.dg/20020523-1.c compilation failed to produce executable
FAIL: gcc.dg/20020523-2.c (test for excess errors)
WARNING: gcc.dg/20020523-2.c compilation failed to produce executable

./xgcc -B ./ -v -m32 -march=pentium3 -msse -ffast-math -O2 20020523-1.c
Reading specs from ./specs
Configured with: ../configure --enable-languages=c,c++
Thread model: posix
gcc version 3.4.4 20050509 (prerelease)
 ./cc1 -quiet -v -iprefix
/usr/src/gcc-3.4/obj/gcc/../lib/gcc/x86_64-unknown-linux-gnu/3.4.4/ -isystem
./include 20020523-1.c -quiet -dumpbase 20020523-1.c -m32 -march=pentium3 -msse
-auxbase 20020523-1 -O2 -version -ffast-math -o /tmp/ccwFAGHO.s
ignoring nonexistent directory
"/usr/src/gcc-3.4/obj/gcc/../lib/gcc/x86_64-unknown-linux-gnu/3.4.4/include"
ignoring nonexistent directory
"/usr/src/gcc-3.4/obj/gcc/../lib/gcc/x86_64-unknown-linux-gnu/3.4.4/../../../../x86_64-unknown-linux-gnu/include"
ignoring nonexistent directory "NONE/include"
ignoring nonexistent directory
"/usr/local/lib/gcc/x86_64-unknown-linux-gnu/3.4.4/include"
ignoring nonexistent directory
"/usr/local/lib/../x86_64-unknown-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 ./include
 /usr/local/include
 /usr/include
End of search list.
GNU C version 3.4.4 20050509 (prerelease) (x86_64-unknown-linux-gnu)
        compiled by GNU C version 3.4.3 20050227 (Red Hat 3.4.3-22.fc3).
GGC heuristics: --param ggc-min-expand=98 --param ggc-min-heapsize=128053
20020523-1.c: In function `main':
20020523-1.c:65: error: could not split insn
(insn:TI 108 123 112 (set (reg:SF 22 xmm1)
        (if_then_else:SF (gt (reg:SF 22 xmm1)
                (mem/u/f:SF (symbol_ref/u:SI ("*.LC2") [flags 0x2]) [2 S4 A32]))
            (reg:SF 21 xmm0 [76])
            (const_double:SF 0.0 [0x0.0p+0]))) 663 {*sse_movsfcc_const0_1}
(insn_list 123 (insn_list 122 (insn_list:REG_DEP_ANTI 106 (nil))))
    (expr_list:REG_DEAD (reg:SF 21 xmm0 [76])
        (expr_list:REG_EQUIV (mem/f:SF (reg/f:SI 7 sp) [0 S4 A32])
            (nil))))
20020523-1.c:65: internal compiler error: in final_scan_insn, at final.c:2429
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.

Reverting this patch fixes it.
Comment 36 Jakub Jelinek 2005-05-09 22:55:35 UTC
Created attachment 8847
patch I'll be testing tomorrow

There seem to be multiple bugs in the backport.
One is that in 3.4.x the insns use VOIDmode on the GT etc. operator, so
with :SF resp. :DF on MATCH_OPERATOR the splitter never matches.
Another bug is a typo in the sse_movsfcc*const* splitter, it creates
(and:V4SF (reg:SF ...) (reg:V4SF ...)) which obviously doesn't match anything.
And also, the gen* programs report missing mode on source (i.e. IF_THEN_ELSE)
in all the 4 patterns.