Bug 40386 - wrong code generation for several SPEC CPU2000 benchmarks (lucas, mgrid, face, applu, apsi) with -O1 -fno-ira-share-spill-slots
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 4.4.1
Importance: P3 normal
Target Milestone: 4.4.5
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-06-09 12:07 UTC by Kenneth Hoste
Modified: 2012-01-07 21:31 UTC

See Also:
Host: linux, x86-64
Target: linux, x86-64
Build: linux, x86-64
Known to work:
Known to fail:
Last reconfirmed:


Attachments
execution tests that FAIL with -fno-ira-share-spill-slots (570 bytes, text/plain)
2010-04-12 18:55 UTC, Zdenek Sojka

Description Kenneth Hoste 2009-06-09 12:07:22 UTC
I'm observing a wrong code generation bug with the 189.lucas benchmark in SPEC CPU2000.

When the Fortran benchmark is being compiled with -O1 -fno-ira-share-spill-slots, the benchmark outputs the following:

iteration=         2  000000000000000E
iteration=         3  00000000000000C2
iteration=         4  0000000000009302
iteration=         5  00000000546B4C02
iteration=         6  1BD696D9F03D3002
M75460003 Roundoff warning on iteration       7 maxerr =  0.499999970088
 FATAL ERROR...Halting execution.

When compiled with -O1, this is (part of) the output (which is correct, as specified by the CPU2000 framework):

iteration=         2  000000000000000E
iteration=         3  00000000000000C2
iteration=         4  0000000000009302
iteration=         5  00000000546B4C02
iteration=         6  1BD696D9F03D3002
iteration=         7  8CC88407A9F4C002
iteration=         8  55599F9D37D30002
<snip>
iteration=       122  E9639F5835FD3C2C
  exponent     residue
  75460003  DDD9C8B13BCB64AE


I'm sorry I'm unable to provide a self-contained testcase for this, but I know little about Fortran. I hope someone else can jump in and provide a testcase for me.
Comment 1 Tobias Burnus 2009-06-09 12:24:52 UTC
Richard, can you try to reproduce this? I don't have SPEC and anyhow it sounds like a middle-end problem.

Comment 2 Kenneth Hoste 2009-06-09 13:35:12 UTC
I did some more experiments, and have some more details to share.

It seems the problem with lucas only occurs with the SVN head of the 4.4 branch I'm working on (r148268), and not with the 4.4.0 release.

However, a similar problem is occurring with the 172.mgrid benchmark, and this _is_ occurring both with the 4.4.0 release and with revision 148268 of the 4.4 branch.

Again, -O1 -fno-ira-share-spill-slots leads to a binary that does not produce correct output (but there's no noticeable crash of the benchmark this time). Using -O1 yields no problems though.

The SPEC framework reports the below as a diff between the expected and observed output for mgrid, and considers the difference to be serious enough to report the run to be invalid.

0019:      0.103090E-02
          -0.907513E-03
                      ^
0020:      0.103090E-02
          -0.907513E-03
                      ^
0021:      0.184495E-02
           0.261074E-02
                      ^
0022:      0.184495E-02
           0.261074E-02
                      ^
0023:      0.366257E-03
          -0.677032E-04
                      ^
0024:      0.366257E-03
          -0.677032E-04
                      ^
0025:      0.436098E-03
           0.179609E-03
                      ^
0026:      0.436098E-03
           0.179609E-03
                      ^
0027:      0.442029E-03
           0.212079E-03
                      ^
0028:      0.442029E-03
           0.212079E-03
                      ^
0029:      0.442962E-03
           0.217171E-03
                      ^

Comment 3 Kenneth Hoste 2009-06-09 14:12:06 UTC
Same problem with 187.facerec, 173.applu and 301.apsi: runs correctly at -O1, wrong code at -O1 -fno-ira-share-spill-slots.

All these benchmarks are written in Fortran (both F77 and F90), so it seems this might be Fortran related.
Comment 4 Richard Biener 2009-06-09 14:22:11 UTC
Are you running in 32bit mode?  Vlad, what does this IRA option do?
Comment 5 Kenneth Hoste 2009-06-09 14:30:43 UTC
(In reply to comment #4)
> Are you running in 32bit mode? 

No, I'm not. Using either -m32 or -m64 makes no difference for lucas, and if I specify neither -m32 nor -m64, I still obtain a 64-bit binary (when not using -fno-ira-share-spill-slots), so definitely 64-bit mode.

Comment 6 Kenneth Hoste 2009-06-09 14:51:46 UTC
Some more related details which might help shed light on the cause behind this.

The 178.galgel benchmark (again, Fortran), is also being miscompiled, but now using

-ffixed-form -fno-ira-share-spill-slots -fno-tree-loop-im 

in combination with -O3 or -Os. Note that -O1 and -O2 are working fine with these options specified. 
(-ffixed-form is always needed to compile galgel, so this one is probably of minor importance here)

On top of this, additionally specifying
 -fno-tree-dominator-opts 
resolves the issue at -O3 (correct code is being emitted), but this is not the case at -Os, where the miscompile still occurs.
Comment 7 Zdenek Sojka 2010-04-12 13:47:32 UTC
Running check on gcc/g++ shows further miscompilations with -fno-ira-share-spill-slots (as of r158131, x86_64-linux):

gcc.c-torture/execute/20021120-1.c FAILs with:
-O2 -fno-ira-share-spill-slots
or
-O1 -foptimize-register-move -fno-ira-share-spill-slots

gcc.c-torture/execute/pr28982a.c FAILs with:
-O1 -fno-ira-share-spill-slots

gcc.dg/graphite/interchange-8.c FAILs with:
-O2 -fpeel-loops -fno-ira-share-spill-slots

gcc.c-torture/execute/regstack-1.c FAILs with:
-Os -fschedule-insns -fno-ira-share-spill-slots -fno-sched-critical-path-heuristic

There are further FAILs with more complicated compiler flags needed to reproduce, and it's also possible I haven't checked all -fno-ira-share-spill-slots miscompilations
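
A minimal sketch of how flag combinations like the ones above could be checked against a single torture test. It assumes the test is self-contained and signals failure through a non-zero exit status (the execute tests call abort() on failure). The function name try_flag_sets and the flagsets.txt file are hypothetical and not part of the GCC testsuite.

```shell
# Build one test source with each flag set read from stdin (one set per line)
# and report whether the resulting binary exits with status 0.
# Usage: try_flag_sets <compiler> <source> < flagsets.txt
try_flag_sets() {
    cc=$1; src=$2
    dir=$(mktemp -d) || return 2
    while IFS= read -r flags; do
        # $flags is intentionally unquoted so it splits into separate options.
        if "$cc" $flags "$src" -o "$dir/test.exe" </dev/null 2>/dev/null &&
           "$dir/test.exe" </dev/null >/dev/null 2>&1; then
            echo "PASS: $flags"
        else
            echo "FAIL: $flags"
        fi
    done
    rm -rf "$dir"
}
```

For example, with flagsets.txt holding the lines "-O2" and "-O2 -fno-ira-share-spill-slots", running try_flag_sets gcc gcc.c-torture/execute/20021120-1.c < flagsets.txt would show whether only the second set fails. A compile error is also reported as FAIL, so a failing line still needs a manual look.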
Comment 8 Zdenek Sojka 2010-04-12 18:55:11 UTC
Created attachment 20370
execution tests that FAIL with -fno-ira-share-spill-slots

r158225, x86_64-linux, languages=c,c++,lto,fortran

$ make check RUNTESTFLAGS="--target_board=unix/-fno-ira-share-spill-slots"
$ cat gcc/testsuite/*/*.log | grep '^FAIL:' | grep 'exec' &> pr40386.txt

Without duplicates, this is the list of files that at least once fail the execution test:
c-c++-common/torture/complex-sign-mixed-div.c
gcc.c-torture/execute/builtins/pr22237.c
gcc.c-torture/execute/pr23135.c
gcc.c-torture/execute/pr28982a.c
gcc.c-torture/execute/pr28982b.c
gcc.c-torture/execute/20020508-2.c
gcc.c-torture/execute/20020508-3.c
gcc.c-torture/execute/20021120-1.c
gcc.dg/guality/inline-params.c
gcc.dg/vect/vect-strided-u8-i8-gap4.c
gcc.target/x86_64/abi/test_passing_floats.c
gfortran.dg/alloc_comp_assign_2.f90
gfortran.dg/alloc_comp_assign_3.f90
gfortran.dg/eoshift_large_1.f90
gfortran.dg/func_derived_1.f90
gfortran.dg/reshape_rank7.f90
gfortran.fortran-torture/execute/intrinsic_cshift.f90
g++.old-deja/g++.eh/ia64-1.C

There were no ICEs caused by this flag.
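
The de-duplicated list above could be derived from pr40386.txt roughly as follows. This is a sketch that assumes every line has the usual DejaGnu shape "FAIL: <file> <description>", so the file name is the second whitespace-separated field; the function name unique_failing_files is hypothetical.

```shell
# Print each failing test file once, given DejaGnu-style "FAIL: <file> ..."
# lines on stdin. Lines that do not start with "FAIL:" are ignored.
unique_failing_files() {
    awk '/^FAIL:/ { print $2 }' | sort -u
}

# Usage, assuming pr40386.txt was produced by the grep pipeline above:
#   unique_failing_files < pr40386.txt
```

The output is sorted by file name, so the order differs from the hand-made list above.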
Comment 9 Vladimir Makarov 2010-09-08 17:44:34 UTC
The problem is that pseudos spilled by IRA (r121 in our case) are
not added to live_throughout of the reload chain.  As a result,
pseudo_forbidden_regs is not set up for such pseudos, and they can get
a hard register (42 in our case) even if they live through insns
(insn 153 in our case) that use a reload (the 0th in our case) with this
register.  This happens when another pseudo is spilled and reload asks
IRA to assign the corresponding hard register to another pseudo.

Here are some parts of IRA dump:

Spilling for insn 153.
Using reg 2 for reload 1
Using reg 42 for reload 0
...
Spilling for insn 238.
Using reg 2 for reload 0
      Spill 117(a35), cost=5000
      Spilled regs 117
        Try assign 121(a6), cost=5000: reassign to 42


The fix is pretty simple.  I'll send it soon.
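
A rough sketch of how the pattern in the dump excerpt above could be spotted mechanically. It assumes the dump lines look exactly like the excerpt ("Spilling for insn N.", "Using reg R for reload K", "... reassign to R"); the function name find_reload_reg_reassignments is hypothetical. The check is only a heuristic: it does not look at liveness, so a hit is a candidate to inspect and not proof of a miscompile.

```shell
# Report "reassign to R" lines where hard reg R was chosen earlier in the
# dump as a reload register. Reads a dump excerpt on stdin.
find_reload_reg_reassignments() {
    awk '
    /Spilling for insn/ { insn = $4; sub(/\.$/, "", insn) }
    /Using reg [0-9]+ for reload/ { used[$3] = used[$3] " " insn }
    /reassign to [0-9]+/ {
        reg = $NF
        pseudo = $3; sub(/\(.*/, "", pseudo)
        if (reg in used)
            printf "pseudo %s reassigned to hard reg %s, a reload reg in insn(s):%s\n", pseudo, reg, used[reg]
    }'
}
```

Fed the excerpt above, this would report pseudo 121 and hard register 42 together with insn 153, which is the combination described in this comment.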

Comment 10 Vladimir Makarov 2010-09-09 13:43:18 UTC
Subject: Bug 40386

Author: vmakarov
Date: Thu Sep  9 13:42:51 2010
New Revision: 164095

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=164095
Log:
2010-09-08  Vladimir Makarov  <vmakarov@redhat.com>

	PR middle-end/40386
	* ira.c (pseudo_for_reload_consideration_p): Don't use
	flag_ira_share_spill_slots.

2010-09-08  Vladimir Makarov  <vmakarov@redhat.com>

	PR middle-end/40386
	* gcc.c-torture/execute/{pr40386.c,pr40386.x}: New testcase.


Added:
    branches/gcc-4_4-branch/gcc/testsuite/gcc.c-torture/execute/pr40386.c
    branches/gcc-4_4-branch/gcc/testsuite/gcc.c-torture/execute/pr40386.x
Modified:
    branches/gcc-4_4-branch/gcc/ChangeLog
    branches/gcc-4_4-branch/gcc/ira.c
    branches/gcc-4_4-branch/gcc/testsuite/ChangeLog

Comment 11 Vladimir Makarov 2010-09-09 13:47:37 UTC
Subject: Bug 40386

Author: vmakarov
Date: Thu Sep  9 13:47:14 2010
New Revision: 164097

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=164097
Log:
2010-09-08  Vladimir Makarov  <vmakarov@redhat.com>

	PR middle-end/40386
	* ira.c (pseudo_for_reload_consideration_p): Don't use
	flag_ira_share_spill_slots.

2010-09-08  Vladimir Makarov  <vmakarov@redhat.com>

	PR middle-end/40386
	* gcc.c-torture/execute/{pr40386.c,pr40386.x}: New testcase.


Added:
    branches/gcc-4_5-branch/gcc/testsuite/gcc.c-torture/execute/pr40386.c
    branches/gcc-4_5-branch/gcc/testsuite/gcc.c-torture/execute/pr40386.x
Modified:
    branches/gcc-4_5-branch/gcc/ChangeLog
    branches/gcc-4_5-branch/gcc/ira.c
    branches/gcc-4_5-branch/gcc/testsuite/ChangeLog

Comment 12 Vladimir Makarov 2010-09-09 13:51:49 UTC
Subject: Bug 40386

Author: vmakarov
Date: Thu Sep  9 13:51:25 2010
New Revision: 164100

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=164100
Log:
2010-09-09  Vladimir Makarov  <vmakarov@redhat.com>

	PR middle-end/40386
	* ira.c (pseudo_for_reload_consideration_p): Don't use
	flag_ira_share_spill_slots.

2010-09-09  Vladimir Makarov  <vmakarov@redhat.com>

	PR middle-end/40386
	* gcc.c-torture/execute/{pr40386.c,pr40386.x}: New testcase.



Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/pr40386.c
    trunk/gcc/testsuite/gcc.c-torture/execute/pr40386.x
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ira.c
    trunk/gcc/testsuite/ChangeLog

Comment 13 Zdenek Sojka 2010-09-15 00:05:33 UTC
It seems all the testsuite failures caused by -fno-ira-share-spill-slots are gone now, good!
Comment 14 Andrew Pinski 2012-01-07 21:31:58 UTC
Fixed.