They all fail at runtime on x86-linux only (work on x86_64-linux), it happened between LAST_UPDATED: Fri Sep 16 18:47:13 UTC 2005 LAST_UPDATED: Tue Sep 20 19:42:32 UTC 2005 It looks like it's all fixed point or decimal artihmetic that gets wrong code generated. One example ,.,. C45532J ACATS 2.5 05-09-21 10:53:40 ---- C45532J FIXED POINT OPERATOR "/". * C45532J RESULT OF "/" OUTSIDE RESULT MODEL INTERVAL WHEN USING NO MODEL NUMBERS. **** C45532J FAILED ****************************. Complete list: c45532j c45532l c45532m c45532n c45532o c45532p ce3806g ce3806h ce3810b cxaa008 cxf2001 cxf2002 cxf2003 cxf2004 cxf2005 cxf2a02 cxg2022
Here is a reduced Ada testcase. -gnatdg show use of 64 bits, but in original test case it was not the case, so it might be a slightly different bug. $ cat > p.adb procedure P is type T is delta 1.0 / (2 ** 32) range -1.0 .. 1.0; A : T := 0.25; B : T := 0.75; C : T := A / B; begin if C = 0.0 then raise Program_Error; end if; end P; $ gnatmake p gcc -c p.adb gnatbind -x p.ali gnatlink p.ali $ ./p raised PROGRAM_ERROR : p.adb:8 explicit raise
Created attachment 9787 [details] -fdump-tree-all Andrew request.
"but in original test case it was not the case": I missed but there's also a call to system.arith_64.scaled_divide in the original test case, so this is the likely miscompiled function.
The 17 x86 only regressions were introduced by this patch, Jan asked me to look for generated libcalls, more info soon. 2005-09-18 Jan Hubicka <jh@suse.cz> * calls.c (flags_from_decl_or_type): Do not set ECF_LIBCALL_BLOCK. Index: calls.c =================================================================== RCS file: /opt/gcc/rsync/gcc-cvs/gcc/gcc/calls.c,v retrieving revision 1.400 retrieving revision 1.401 diff -u -r1.400 -r1.401 --- calls.c 31 Aug 2005 03:33:23 -0000 1.400 +++ calls.c 18 Sep 2005 17:14:24 -0000 1.401 @@ -582,7 +582,7 @@ /* The function exp may have the `pure' attribute. */ if (DECL_IS_PURE (exp)) - flags |= ECF_PURE | ECF_LIBCALL_BLOCK; + flags |= ECF_PURE; if (DECL_IS_NOVOPS (exp)) flags |= ECF_NOVOPS; @@ -591,7 +591,7 @@ flags |= ECF_NOTHROW; if (TREE_READONLY (exp) && ! TREE_THIS_VOLATILE (exp)) - flags |= ECF_LIBCALL_BLOCK | ECF_CONST; + flags |= ECF_CONST; flags = special_function_p (exp, flags); } @@ -606,7 +606,7 @@ if (TREE_CODE (type) == FUNCTION_TYPE && TYPE_RETURNS_STACK_DEPRESSED (type)) { flags |= ECF_SP_DEPRESSED; - flags &= ~(ECF_PURE | ECF_CONST | ECF_LIBCALL_BLOCK); + flags &= ~(ECF_PURE | ECF_CONST); } return flags;
Beginning of the -fdump-tree-all diff before and after patch. -- before/s-arit64.adb.00.expand 2005-09-23 20:01:11.000000000 +0200 +++ after/s-arit64.adb.00.expand 2005-09-23 19:41:23.000000000 +0200 @@ -2795,143 +2795,119 @@ (call_insn/u 18 17 19 (set (reg:SI 0 ax) (call (mem:QI (symbol_ref:SI ("system__arith_64__hi") [flags 0x3] <function_decl 0x40196d80 system__arith_64__hi>) [0 S1 A8]) (const_int 8 [0x8]))) -1 (nil) - (nil) + (expr_list:REG_EH_REGION (const_int 0 [0x0]) + (nil)) (expr_list:REG_DEP_TRUE (use (mem/i:DI (reg/f:SI 56 virtual-outgoing-args) [0 S8 A32])) (nil))) -(insn 19 18 20 (set (reg:SI 73) +(insn 19 18 0 (set (reg/v:SI 66 [ xhi ]) (reg:SI 0 ax)) -1 (nil) - (expr_list:REG_EQUAL (expr_list:REG_DEP_TRUE (symbol_ref:SI ("system__arith_64__hi") [flags 0x3] <function_decl 0x40196d80 system__arith_64__hi>) - (expr_list:REG_DEP_TRUE (reg/v:DI 67 [ xu ]) - (nil))) - (nil))) - -(insn 20 19 0 (set (reg/v:SI 66 [ xhi ]) - (reg:SI 73)) -1 (nil) (nil)) ;; xlo = system__arith_64__lo (xu) -(insn 22 20 23 (set (mem:DI (reg/f:SI 56 virtual-outgoing-args) [0 S8 A32]) +(insn 21 19 22 (set (mem:DI (reg/f:SI 56 virtual-outgoing-args) [0 S8 A32]) (reg/v:DI 67 [ xu ])) -1 (nil) (nil)) -(call_insn/u 23 22 24 (set (reg:SI 0 ax) +(call_insn/u 22 21 23 (set (reg:SI 0 ax) (call (mem:QI (symbol_ref:SI ("system__arith_64__lo") [flags 0x3] <function_decl 0x40196d00 system__arith_64__lo>) [0 S1 A8]) (const_int 8 [0x8]))) -1 (nil) - (nil) + (expr_list:REG_EH_REGION (const_int 0 [0x0]) + (nil)) (expr_list:REG_DEP_TRUE (use (mem/i:DI (reg/f:SI 56 virtual-outgoing-args) [0 S8 A32])) (nil))) ...
Subject: Re: [4.1 Regression] ACATS FAIL 17 regressions on x86-linux, fixed and decimal arithmetic broken > > ------- Additional Comments From laurent at guerby dot net 2005-09-23 19:46 ------- > Beginning of the -fdump-tree-all diff before and after patch. > > -- before/s-arit64.adb.00.expand 2005-09-23 20:01:11.000000000 +0200 > +++ after/s-arit64.adb.00.expand 2005-09-23 19:41:23.000000000 +0200 > @@ -2795,143 +2795,119 @@ > (call_insn/u 18 17 19 (set (reg:SI 0 ax) > (call (mem:QI (symbol_ref:SI ("system__arith_64__hi") [flags 0x3] > <function_decl 0x40196d80 system__arith_64__hi>) [0 S1 A8]) > (const_int 8 [0x8]))) -1 (nil) > - (nil) > + (expr_list:REG_EH_REGION (const_int 0 [0x0]) > + (nil)) While trying to track down this problem I got lost in the handling of EH and libcall blocks. It looks like emit_libcall can detect possibly trapping libcalls but elliminates the the EH_REGION note when -fnon-call-exceptions is set. When it is not the EH_REGION notes are set to -1, while for non-trapping calls in calls.c the region 0 is used. I am somwhat lost in this code and it seems to me that the failures might come from different EH delivery. Honza
A priori no exception is ever raised in this test. Who could help on this one? Is reverting this patch an option? Laurent
Well, if the exception is never raised, the difference in EH code generation is probably not an issue. Reverting the patch is definitly possible (there is nothing dependent on it and except for one or two PRs exposing problems in libcall mechanizm it only results in better code. However this is almost definitly previously latent bug (we just disable RTL level optimizatoin), I would like to have some understanding to what is going wrong. Honza
Ok, no need to revert it right now. I'm trying to reduce it to something standalone that doesn't call the Ada runtime.
It looks like it is indeed a codegen bug in s-arit64.adb, in Scaled_Divide the following line is miscompiled at -O1 and above (works at -O0): T2 := Lo (T1 rem Zlo) & D (4); In my p.adb testcase, the compiler with Jan patch calls Lo with argument = 0 whereas (T1 rem Zlo) returned (1073741824 rem 3221225472) = 1073741824 so stack/argument handling must be confused at some point. The bug is volatile, ie replacing the line above by: S3 := Lo (T1 rem Zlo); if S3 = 0 then T2 := Uns64 (D (4)); else T2 := S3 & D (4); end if; Makes it go away. Laurent
Subject: Re: [4.1 Regression] ACATS FAIL 17 regressions on x86-linux, fixed and decimal arithmetic broken > > > ------- Comment #10 from laurent at guerby dot net 2005-10-02 16:20 ------- > It looks like it is indeed a codegen bug in s-arit64.adb, in Scaled_Divide the > following line is miscompiled at -O1 and above (works at -O0): > > T2 := Lo (T1 rem Zlo) & D (4); > > In my p.adb testcase, the compiler with Jan patch calls Lo with argument = 0 > whereas (T1 rem Zlo) returned (1073741824 rem 3221225472) = 1073741824 so > stack/argument handling must be confused at some point. > > The bug is volatile, ie replacing the line above by: > > S3 := Lo (T1 rem Zlo); > if S3 = 0 then > T2 := Uns64 (D (4)); > else > T2 := S3 & D (4); > end if; > > Makes it go away. Can you please check if -fno-tree-ter makes the bug go away? We've chatted recently with Jakub concerning bug in argument saving in nested call and -maccumulate-outgoing-args. I was under impression that we should no longer produce the nested calls out of ter but aparently it is still being done (that results in inferrior code quality too as we can't generated nested calls very sanely) Also can you also, please, point me closer to place in .optimized dump where the misscompilation happens? THere seems to be couple of lo calls around. I will be mostly offline tomorrow but should be able to fix it day after that. Thanks a lot! Honza
> Can you please check if -fno-tree-ter makes the bug go away? We've > chatted recently with Jakub concerning bug in argument saving in nested > call and -maccumulate-outgoing-args. I was under impression that we > should no longer produce the nested calls out of ter but aparently it is > still being done (that results in inferrior code quality too as we can't > generated nested calls very sanely) To back up your hypothesis, I can say that I'm *not* seeing the failures on i586.
Subject: Re: [4.1 Regression] ACATS FAIL 17 regressions on x86-linux, fixed and decimal arithmetic broken > > > ------- Comment #12 from ebotcazou at gcc dot gnu dot org 2005-10-02 21:55 ------- > > Can you please check if -fno-tree-ter makes the bug go away? We've > > chatted recently with Jakub concerning bug in argument saving in nested > > call and -maccumulate-outgoing-args. I was under impression that we > > should no longer produce the nested calls out of ter but aparently it is > > still being done (that results in inferrior code quality too as we can't > > generated nested calls very sanely) > > To back up your hypothesis, I can say that I'm *not* seeing the failures on > i586. That seems consistent. i586 defaults to -mno-accumulate-outgoing-args. const int x86_accumulate_outgoing_args = m_ATHLON_K8 | m_PENT4 | m_NOCONA | m_PPRO; Honza > > > -- > > ebotcazou at gcc dot gnu dot org changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > CC| |ebotcazou at gcc dot gnu dot > | |org > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24003 > > ------- You are receiving this mail because: ------- > You are on the CC list for the bug, or are watching someone who is.
I confirm that on p.adb no-tree-ter does make the problem disappear, p.adb works when I compile s-arit64 as follows: gcc -c -O2 -gnatpg -fno-tree-ter s-arit64.adb (I'll be away from email until wednesday).
Created attachment 9897 [details] -fdump-rtl-expand-details s-arit64.adb.00.expand-normal : -O2 -gnatpg s-arit64.adb.00.expand-no-tree-ter : -O2 -gnatpg -fno-tree-ter Limited to function Scaled_Divide.
This most likely can be produced using C code, just has not found a testcase yet.
Ouch! I cannot reproduce with a i586-suse-linux compiler configured with full checking and gccflags="-O2 -march=i686".
Still seeing the failures on SuSE 9.3 i686 machine with "-enable-languages=ada,c --enable-__cxa_atexit --disable-nls --enable-threads=posix --disable-multilib" --enable-checking as of: LAST_UPDATED: Mon Oct 24 17:22:09 UTC 2005 I have some patches from Richard Kenner in my tree though. Eric, did you manage to reproduce the problem at all?
> I have some patches from Richard Kenner in my tree though. Eric, did you manage > to reproduce the problem at all? Nein. :-) But I'll try a bit harder tomorrow and on various machines.
Jan Hubicka told me on IRC that Jakub Jelinek is working on PR23775 which might be related (or the same issue) as this one.
OK, confirmed with a pristine tree on i686.
Jan, > Can you please check if -fno-tree-ter makes the bug go away? We've > chatted recently with Jakub concerning bug in argument saving in nested > call and -maccumulate-outgoing-args. I was under impression that we > should no longer produce the nested calls out of ter but aparently it is > still being done (that results in inferrior code quality too as we can't > generated nested calls very sanely) Would that be an acceptable fix (not rematerializing nested calls)? > Also can you also, please, point me closer to place in .optimized dump > where the misscompilation happens? <L12>:; - temp.456 = d[3]{lb: 1 sz: 4}; - t1.427 = system__arith_64__Oconcat (temp.433, temp.456); - temp.460 = d[4]{lb: 1 sz: 4}; - D.712 = system__arith_64__Orem (t1.427, zlo); - D.713 = system__arith_64__lo (D.712); - t2.450 = system__arith_64__Oconcat (D.713, temp.460); - D.715 = system__arith_64__Odivide (t2.450, zlo); - D.716 = system__arith_64__lo (D.715); - D.717 = system__arith_64__Odivide (t1.427, zlo); - D.718 = system__arith_64__lo (D.717); - qu = system__arith_64__Oconcat (D.718, D.716); + t1.427 = system__arith_64__Oconcat (temp.433, d[3]{lb: 1 sz: 4}); + t2.450 = system__arith_64__Oconcat (system__arith_64__lo (system__arith_64__Orem (t1.427, zlo)), d[4]{lb: 1 sz: 4}); + qu = system__arith_64__Oconcat (system__arith_64__lo (system__arith_64__Odivide (t1.427, zlo)), system__arith_64__lo (system__arith_64__Odivide (t2.450, zlo))); ru = system__arith_64__Orem (t2.450, zlo); goto <bb 37> (<L50>); -: -O1 -fno-tree-ter +: -O1
Do we have a C/C++ testcase for this problem? I'm going to leave this as P2 for now, given that we think it's not language-dependent, and given that we seem close to a fix.
> Do we have a C/C++ testcase for this problem? Yes, we do, attached. > I'm going to leave this as P2 for now, given that we think it's not > language-dependent, and given that we seem close to a fix. Pretty serious bug indeed.
Created attachment 10212 [details] C testcase.
Subject: Bug 24003 Author: ebotcazou Date: Sun Nov 13 09:55:11 2005 New Revision: 106860 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=106860 Log: PR middle-end/24003 * calls.c (expand_call): If TARGET is a MEM and some part of the argument area has been saved, force TARGET to a register. Added: trunk/gcc/testsuite/gcc.dg/nested-calls-1.c Modified: trunk/gcc/ChangeLog trunk/gcc/calls.c trunk/gcc/testsuite/ChangeLog
See http://gcc.gnu.org/ml/gcc-patches/2005-11/msg00807.html