I'm observing a wrong-code bug with the 189.lucas benchmark in SPEC CPU2000. When the Fortran benchmark is compiled with -O1 -fno-ira-share-spill-slots, it outputs the following:

    iteration= 2 000000000000000E
    iteration= 3 00000000000000C2
    iteration= 4 0000000000009302
    iteration= 5 00000000546B4C02
    iteration= 6 1BD696D9F03D3002
    M75460003 Roundoff warning on iteration 7
    maxerr = 0.499999970088
    FATAL ERROR...Halting execution.

When compiled with -O1, this is (part of) the output, which is correct as specified by the CPU2000 framework:

    iteration= 2 000000000000000E
    iteration= 3 00000000000000C2
    iteration= 4 0000000000009302
    iteration= 5 00000000546B4C02
    iteration= 6 1BD696D9F03D3002
    iteration= 7 8CC88407A9F4C002
    iteration= 8 55599F9D37D30002
    <snip>
    iteration= 122 E9639F5835FD3C2C
    exponent residue 75460003 DDD9C8B13BCB64AE

I'm sorry that I'm unable to provide a self-contained testcase for this, as I know little about Fortran. I hope someone else can jump in and provide a testcase.
Richard, can you try to reproduce this? I don't have SPEC and anyhow it sounds like a middle-end problem.
I did some more experiments and have some more details to share. It seems the problem with lucas only occurs with the SVN head of the 4.4 branch I'm working on (r148268), and not with the 4.4.0 release. However, a similar problem occurs with the 172.mgrid benchmark, and this one _does_ occur both with the 4.4.0 release and with revision 148268 of the 4.4 branch. Again, -O1 -fno-ira-share-spill-slots leads to a binary that does not produce correct output (but there's no noticeable crash of the benchmark this time). Using -O1 alone yields no problems.

The SPEC framework reports the following as a diff between the expected and observed output for mgrid, and considers the difference serious enough to report the run as invalid:

    0019: 0.103090E-02 -0.907513E-03 ^
    0020: 0.103090E-02 -0.907513E-03 ^
    0021: 0.184495E-02 0.261074E-02 ^
    0022: 0.184495E-02 0.261074E-02 ^
    0023: 0.366257E-03 -0.677032E-04 ^
    0024: 0.366257E-03 -0.677032E-04 ^
    0025: 0.436098E-03 0.179609E-03 ^
    0026: 0.436098E-03 0.179609E-03 ^
    0027: 0.442029E-03 0.212079E-03 ^
    0028: 0.442029E-03 0.212079E-03 ^
    0029: 0.442962E-03 0.217171E-03 ^
Same problem with 187.facerec, 173.applu and 301.apsi: they run correctly at -O1, but wrong code is generated at -O1 -fno-ira-share-spill-slots. All these benchmarks are written in Fortran (both F77 and F90), so it seems this might be Fortran related.
Are you running in 32bit mode? Vlad, what does this IRA option do?
(In reply to comment #4)
> Are you running in 32bit mode?

No, I'm not. Using either -m32 or -m64 makes no difference for lucas, and with either -m32 or -m64 I still obtain a 64-bit binary (when not using -fno-ira-share-spill-slots), so this is definitely 64-bit mode.
Some more related details which might help shed light on the cause behind this. The 178.galgel benchmark (again, Fortran) is also being miscompiled, but this time using -ffixed-form -fno-ira-share-spill-slots -fno-tree-loop-im in combination with -O3 or -Os. Note that -O1 and -O2 work fine with these options specified. (-ffixed-form is always needed to compile galgel, so this one is probably of minor importance here.) On top of this, additionally specifying -fno-tree-dominator-opts resolves the issue at -O3 (correct code is emitted), but this is not the case at -Os, where the miscompile still occurs.
Running check on gcc/g++ shows further miscompilations with -fno-ira-share-spill-slots (as of r158131, x86_64-linux):

    gcc.c-torture/execute/20021120-1.c FAILs with:
        -O2 -fno-ira-share-spill-slots
        or -O1 -foptimize-register-move -fno-ira-share-spill-slots
    gcc.c-torture/execute/pr28982a.c FAILs with:
        -O1 -fno-ira-share-spill-slots
    gcc.dg/graphite/interchange-8.c FAILs with:
        -O2 -fpeel-loops -fno-ira-share-spill-slots
    gcc.c-torture/execute/regstack-1.c FAILs with:
        -Os -fschedule-insns -fno-ira-share-spill-slots -fno-sched-critical-path-heuristic

There are further FAILs that need more complicated compiler flags to reproduce, and it's also possible I haven't checked all -fno-ira-share-spill-slots miscompilations.
Created attachment 20370 [details]
execution tests that FAIL with -fno-ira-share-spill-slots

r158225, x86_64-linux, languages=c,c++,lto,fortran

    $ make check RUNTESTFLAGS="--target_board=unix/-fno-ira-share-spill-slots"
    $ cat gcc/testsuite/*/*.log | grep '^FAIL:' | grep 'exec' &> pr40386.txt

Without duplicates, this is the list of files that fail the execution test at least once:

    c-c++-common/torture/complex-sign-mixed-div.c
    gcc.c-torture/execute/builtins/pr22237.c
    gcc.c-torture/execute/pr23135.c
    gcc.c-torture/execute/pr28982a.c
    gcc.c-torture/execute/pr28982b.c
    gcc.c-torture/execute/20020508-2.c
    gcc.c-torture/execute/20020508-3.c
    gcc.c-torture/execute/20021120-1.c
    gcc.dg/guality/inline-params.c
    gcc.dg/vect/vect-strided-u8-i8-gap4.c
    gcc.target/x86_64/abi/test_passing_floats.c
    gfortran.dg/alloc_comp_assign_2.f90
    gfortran.dg/alloc_comp_assign_3.f90
    gfortran.dg/eoshift_large_1.f90
    gfortran.dg/func_derived_1.f90
    gfortran.dg/reshape_rank7.f90
    gfortran.fortran-torture/execute/intrinsic_cshift.f90
    g++.old-deja/g++.eh/ia64-1.C

There were no ICEs caused by this flag.
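For anyone repeating this survey, the de-duplication step can be scripted. The following is only a sketch of one way to do it: it applies the same filter as the grep pipeline above (lines starting with `FAIL:` that contain `exec`) and keeps the second whitespace-separated field as the test file. The function name and the sample log lines are made up for illustration, and the exact layout of DejaGnu `FAIL:` lines is an assumption here.

```python
def unique_exec_failures(log_lines):
    """Return the sorted, de-duplicated test files with a failing execution test."""
    failed = set()
    for line in log_lines:
        # Same filter as: grep '^FAIL:' | grep 'exec'
        if not line.startswith("FAIL:") or "exec" not in line:
            continue
        fields = line.split()
        if len(fields) >= 2:
            # Assumption: the test file is the first field after "FAIL:".
            failed.add(fields[1])
    return sorted(failed)


# Illustrative lines only, not copied from a real log.
sample = [
    "FAIL: gcc.c-torture/execute/pr28982a.c execution,  -O1 ",
    "FAIL: gcc.c-torture/execute/pr28982a.c execution,  -O2 ",
    "PASS: gcc.c-torture/execute/20020508-2.c execution,  -O1 ",
    "FAIL: gcc.dg/example.c (test for excess errors)",
    "FAIL: gfortran.dg/reshape_rank7.f90  -O1  execution test",
]

for name in unique_exec_failures(sample):
    print(name)
```

Like the original grep, this matches `exec` anywhere on the line, so any FAIL for a file under an `execute/` directory is counted even when the failing step is not the execution test.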
The problem is that pseudos spilled by IRA (r121 in our case) are not added to live_throughout of the reload chain. As a result, pseudo_forbidden_regs is not set up for such pseudos, and they can get a hard register (42 in our case) even if they live through insns (insn 153 in our case) that use a reload (the 0th in our case) with this register. This happens when another pseudo is spilled and reload asks IRA to assign the corresponding hard register to another pseudo.

Here are some parts of the IRA dump:

    Spilling for insn 153.
    Using reg 2 for reload 1
    Using reg 42 for reload 0
    ...
    Spilling for insn 238.
    Using reg 2 for reload 0
    Spill 117(a35), cost=5000
    Spilled regs 117
    Try assign 121(a6), cost=5000: reassign to 42

The fix is pretty simple. I'll send it soon.
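The mechanism can be illustrated with a toy model. This is not GCC code: the data layout, the function names and the `include_spilled` switch are all hypothetical, and the liveness of pseudo 121 across insns 153 and 238 is assumed for the sake of the example. Only the register, pseudo and insn numbers are taken from the dump.

```python
def build_chain(live, reload_regs, spilled, include_spilled):
    """Build a toy reload chain: one record per insn needing reloads."""
    chain = []
    for insn in sorted(live):
        # Without include_spilled, spilled pseudos are left out of live_throughout.
        tracked = {p for p in live[insn] if include_spilled or p not in spilled}
        chain.append({
            "insn": insn,
            "live_throughout": tracked,
            "reload_regs": set(reload_regs[insn]),
        })
    return chain


def forbidden_regs(chain, pseudo):
    """Hard regs used by reloads in insns the pseudo is recorded as living through."""
    regs = set()
    for entry in chain:
        if pseudo in entry["live_throughout"]:
            regs |= entry["reload_regs"]
    return regs


def try_assign(chain, pseudo, candidates):
    """Return the first candidate hard reg that is not forbidden, else None."""
    banned = forbidden_regs(chain, pseudo)
    for reg in candidates:
        if reg not in banned:
            return reg
    return None


LIVE = {153: {121}, 238: {121}}          # assumed liveness of pseudo 121
RELOAD_REGS = {153: {2, 42}, 238: {2}}   # reload registers from the dump
SPILLED = {121}                          # pseudo 121 was spilled by IRA

buggy = build_chain(LIVE, RELOAD_REGS, SPILLED, include_spilled=False)
fixed = build_chain(LIVE, RELOAD_REGS, SPILLED, include_spilled=True)

print(try_assign(buggy, 121, [42]))  # 42: clashes with reload 0 of insn 153
print(try_assign(fixed, 121, [42]))  # None: the pseudo stays spilled
```

In the buggy chain the forbidden set for pseudo 121 is empty, so register 42 looks free even though a reload in insn 153 uses it. Once spilled pseudos are tracked, both 2 and 42 are forbidden and the reassignment is refused.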
Subject: Bug 40386

Author: vmakarov
Date: Thu Sep 9 13:42:51 2010
New Revision: 164095

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=164095
Log:
2010-09-08 Vladimir Makarov <vmakarov@redhat.com>

    PR middle-end/40386
    * ira.c (pseudo_for_reload_consideration_p): Don't use
    flag_ira_share_spill_slots.

2010-09-08 Vladimir Makarov <vmakarov@redhat.com>

    PR middle-end/40386
    * gcc.c-torture/execute/{pr40386.c,pr40386.x}: New testcase.

Added:
    branches/gcc-4_4-branch/gcc/testsuite/gcc.c-torture/execute/pr40386.c
    branches/gcc-4_4-branch/gcc/testsuite/gcc.c-torture/execute/pr40386.x
Modified:
    branches/gcc-4_4-branch/gcc/ChangeLog
    branches/gcc-4_4-branch/gcc/ira.c
    branches/gcc-4_4-branch/gcc/testsuite/ChangeLog
Subject: Bug 40386

Author: vmakarov
Date: Thu Sep 9 13:47:14 2010
New Revision: 164097

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=164097
Log:
2010-09-08 Vladimir Makarov <vmakarov@redhat.com>

    PR middle-end/40386
    * ira.c (pseudo_for_reload_consideration_p): Don't use
    flag_ira_share_spill_slots.

2010-09-08 Vladimir Makarov <vmakarov@redhat.com>

    PR middle-end/40386
    * gcc.c-torture/execute/{pr40386.c,pr40386.x}: New testcase.

Added:
    branches/gcc-4_5-branch/gcc/testsuite/gcc.c-torture/execute/pr40386.c
    branches/gcc-4_5-branch/gcc/testsuite/gcc.c-torture/execute/pr40386.x
Modified:
    branches/gcc-4_5-branch/gcc/ChangeLog
    branches/gcc-4_5-branch/gcc/ira.c
    branches/gcc-4_5-branch/gcc/testsuite/ChangeLog
Subject: Bug 40386

Author: vmakarov
Date: Thu Sep 9 13:51:25 2010
New Revision: 164100

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=164100
Log:
2010-09-09 Vladimir Makarov <vmakarov@redhat.com>

    PR middle-end/40386
    * ira.c (pseudo_for_reload_consideration_p): Don't use
    flag_ira_share_spill_slots.

2010-09-09 Vladimir Makarov <vmakarov@redhat.com>

    PR middle-end/40386
    * gcc.c-torture/execute/{pr40386.c,pr40386.x}: New testcase.

Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/pr40386.c
    trunk/gcc/testsuite/gcc.c-torture/execute/pr40386.x
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ira.c
    trunk/gcc/testsuite/ChangeLog
It seems all the testsuite failures caused by -fno-ira-share-spill-slots are gone now, good!
Fixed.