Bug 47166 - [4.5 Regression] SpecCpu2000 Ammp segfaults for ARM with -O3 -mthumb
Summary: [4.5 Regression] SpecCpu2000 Ammp segfaults for ARM with -O3 -mthumb
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 4.5.2
: P3 critical
Target Milestone: 4.5.3
Assignee: Not yet assigned to anyone
URL:
Keywords: patch, wrong-code
Depends on: 41085 45051
Blocks:
  Show dependency treegraph
 
Reported: 2011-01-04 11:52 UTC by Ian Bolton
Modified: 2011-03-15 09:38 UTC (History)
6 users (show)

See Also:
Host: arm-linux-gnueabi
Target: arm-linux-gnueabi
Build: arm-linux-gnueabi
Known to work: 4.5.1, 4.6.0
Known to fail: 4.5.2
Last reconfirmed: 2011-01-05 16:45:57


Attachments
Preprocessed source, before/after .s file, before/after IRA dump (671.14 KB, application/x-gzip)
2011-01-04 12:30 UTC, Ian Bolton
Details
Test patch (1003 bytes, patch)
2011-01-11 17:02 UTC, Bernd Schmidt
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ian Bolton 2011-01-04 11:52:59 UTC
As of r162558 on trunk, and r164498 on 4.5 branch, SpecCPU2000 Ammp has failed for ARM -O3 thumb.  The same patch was committed for both of these revisions, intending to fix pr45051:

Index: gcc/reload1.c
===================================================================
--- gcc/reload1.c       (revision 164495)
+++ gcc/reload1.c       (revision 164498)
@@ -8325,6 +8325,8 @@ delete_output_reload (rtx insn, int j, i
   int n_inherited = 0;
   rtx i1;
   rtx substed;
+  unsigned regno;
+  int nregs;
 
   /* It is possible that this reload has been only used to set another reload
      we eliminated earlier and thus deleted this instruction too.  */
@@ -8376,6 +8378,12 @@ delete_output_reload (rtx insn, int j, i
   if (n_occurrences > n_inherited)
     return;
 
+  regno = REGNO (reg);
+  if (regno >= FIRST_PSEUDO_REGISTER)
+    nregs = 1;
+  else
+    nregs = hard_regno_nregs[regno][GET_MODE (reg)];
+
   /* If the pseudo-reg we are reloading is no longer referenced
      anywhere between the store into it and here,
      and we're within the same basic block, then the value can only
@@ -8387,7 +8395,7 @@ delete_output_reload (rtx insn, int j, i
       if (NOTE_INSN_BASIC_BLOCK_P (i1))
        return;
       if ((NONJUMP_INSN_P (i1) || CALL_P (i1))
-         && reg_mentioned_p (reg, PATTERN (i1)))
+         && refers_to_regno_p (regno, regno + nregs, PATTERN (i1), NULL))
        {
          /* If this is USE in front of INSN, we only have to check that
             there are no more references than accounted for by inheritance.  */

I assume the patch was meant to prevent deletions that shouldn't occur.  This
might be what happens for the original symptomatic test, but I am now seeing
extra deletions that shouldn't happen for Ammp.

For example, without this patch, you get these insns somewhere in the ira dump
for mm_fv_update_nonbon() from rectmm.c:

 (insn 3163 3161 3164 107 rectmm.c:1041 (set (reg:SI 1 r1) 
    (plus:SI (reg:SI 1 r1) 
       (const_int 280 [0x118]))) 4 {*arm_addsi3} (nil))

 (insn 3164 3163 1730 107 rectmm.c:1041 (set (reg:SI 3 r3) 
    (reg:SI 1 r1)) 586 {*thumb2_movsi_vfp} (nil))

With the patch, you lose the add and just get this:

 (insn 3164 3161 1730 107 rectmm.c:1041 (set (reg:SI 3 r3)
    (reg:SI 1 r1)) 586 {*thumb2_movsi_vfp} (nil)) 

The incrementing of r1 is perfectly legitimate and useful and removing it is a
bug.  Other increments of r9, ip, r0 and r3 are also lost.

I think the issue might be that reg_mentioned_p() considers output registers to
have been "mentioned", whereas the refers_to_regno_p() does not consider an
output register to have been "referred to".  I can see the problem with only
using reg_mentioned_p() because it doesn't handle subregs, but there is also a
problem with only using refers_to_regno_p(), as we can see with this segfault
in Ammp.

I therefore wonder if the fix might be this:

Index: gcc/reload1.c
===================================================================
--- gcc/reload1.c       (revision 168082)
+++ gcc/reload1.c       (working copy)
@@ -8395,7 +8395,8 @@ delete_output_reload (rtx insn, int j, i
       if (NOTE_INSN_BASIC_BLOCK_P (i1))
        return;
       if ((NONJUMP_INSN_P (i1) || CALL_P (i1))
-         && refers_to_regno_p (regno, regno + nregs, PATTERN (i1), NULL))
+         && (refers_to_regno_p (regno, regno + nregs, PATTERN (i1), NULL)
+             || reg_mentioned_p (reg, PATTERN (i1))))
        {
          /* If this is USE in front of INSN, we only have to check that
             there are no more references than accounted for by inheritance. 
*/

It would be helpful to have some input from someone who understands reload better than I do, so we can come up with a fix that we have confidence in.
Comment 1 Bernd Schmidt 2011-01-04 12:03:09 UTC
Please post a preprocessed source file and the exact compiler options that must be passed to cc1 to produce incorrect assembly. If possible, also include the two dump files from before/after things go wrong.
Comment 2 Ian Bolton 2011-01-04 12:30:23 UTC
Created attachment 22896 [details]
Preprocessed source, before/after .s file, before/after IRA dump

Contains the following:

pr47166/rectmm.i - the preprocessed source

pr47166/CMD - how to compile it

pr47166/rectmm.r164495.i.188r.ira - working IRA dump
pr47166/rectmm.r164495.s - working assembly

pr47166/rectmm.r164498.i.188r.ira - broken IRA dump
pr47166/rectmm.r164498.s - broken assembly

Note that the revision numbers represent working/broken revisions of gcc 4.5 branch.
Comment 3 Ian Bolton 2011-01-04 12:32:50 UTC
> 
> pr47166/CMD - how to compile it
> 

Sorry, I forgot to give cc1 commands:

cc1 -fpreprocessed rectmm.i -quiet -dumpbase rectmm.i -mthumb -mcpu=cortex-a9 -mfloat-abi=softfp -mfpu=vfpv3-d16 -auxbase rectmm -O3 -version -fno-tree-vectorize -fdump-rtl-ira-details -o rectmm.s
Comment 4 Hans-Peter Nilsson 2011-01-05 06:10:05 UTC
(In reply to comment #0)
>  (insn 3163 3161 3164 107 rectmm.c:1041 (set (reg:SI 1 r1) 
>     (plus:SI (reg:SI 1 r1) 
>        (const_int 280 [0x118]))) 4 {*arm_addsi3} (nil))

Please consider my request in <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45051#c8>.
Comment 5 Hans-Peter Nilsson 2011-01-05 08:21:11 UTC
(In reply to comment #4)
> (In reply to comment #0)
> >  (insn 3163 3161 3164 107 rectmm.c:1041 (set (reg:SI 1 r1) 
> >     (plus:SI (reg:SI 1 r1) 
> >        (const_int 280 [0x118]))) 4 {*arm_addsi3} (nil))
> 
> Please consider my request in
> <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45051#c8>.

I mean, the <But r1 is an input as well as an output , i.e. "referred to", so insn 3163 should have matched without your patch.  I'm missing analysis on why that didn't happen> part.
Comment 6 Ian Bolton 2011-01-05 14:38:42 UTC
> I mean, the <But r1 is an input as well as an output , i.e. "referred to", so
> insn 3163 should have matched without your patch.  I'm missing analysis on why
> that didn't happen> part.

OK, I will do more analysis to try to determine what's going on.
Comment 7 Ian Bolton 2011-01-05 16:45:57 UTC
I have altered the source, so that I call both refers_to_regno_p and reg_mentioned_p and print out when the two disagree.  For the attached rectmm.i input file, there are only 5 disagreements:

mismatch for regno=2343: refersto=F,mentions=T
reg = (reg/f:SI 2343)
i1 = (insn 3162 1730 3165 99 rectmm.c:1041 (set (reg/f:SI 2343)
        (reg:SI 1 r1)) -1 (nil))

mismatch for regno=2355: refersto=F,mentions=T
reg = (reg/f:SI 2355)
i1 = (insn 3166 1745 3169 99 rectmm.c:1043 (set (reg/f:SI 2355)
        (reg:SI 9 r9)) -1 (nil))

mismatch for regno=2377: refersto=F,mentions=T
reg = (reg/f:SI 2377)
i1 = (insn 3170 1770 1731 99 rectmm.c:1045 (set (reg/f:SI 2377)
        (reg:SI 12 ip)) -1 (nil))

mismatch for regno=2349: refersto=F,mentions=T
reg = (reg/f:SI 2349)
i1 = (insn 3174 1737 3177 99 rectmm.c:1042 (set (reg/f:SI 2349)
        (reg:SI 0 r0)) -1 (nil))

mismatch for regno=2361: refersto=F,mentions=T - reg and i1 follow ...
reg = (reg/f:SI 2361)
il = (insn 3178 1752 1772 99 rectmm.c:1044 (set (reg/f:SI 2361)
        (reg:SI 3 r3)) -1 (nil))

This occurs because reg_mentioned_p looks at output regs, but refers_to_regno_p does not.

The knock-on effect of this difference must lead to those increments being lost, but I don't know why yet.
Comment 8 Bernd Schmidt 2011-01-11 15:45:34 UTC
I think the real problem here is that when reloading autoincs, we somehow end up with 

(gdb) p spill_reg_store[3]
$42 = (rtx) 0xf7a1118c
(gdb) pr
(insn 3163 3161 3164 99 rectmm.c:1041 (set (reg:SI 1 r1)
        (plus:SI (reg:SI 1 r1)
            (const_int 280 [0x118]))) 4 {*arm_addsi3} (nil))

which doesn't set R3 at all - insn 3164 does.
Comment 9 Bernd Schmidt 2011-01-11 17:02:40 UTC
Created attachment 22945 [details]
Test patch

Could you try the following? It's a variant of a patch Richard Sandiford recently posted.
Comment 10 Ian Bolton 2011-01-11 17:42:52 UTC
(In reply to comment #9)
> Created attachment 22945 [details]
> Test patch
> 
> Could you try the following? It's a variant of a patch Richard Sandiford
> recently posted.

Thanks for looking into this.

The patch has worked on gcc 4.5 for Ammp, but I still need to test Applu (which is failing for current version of gcc 4.5 branch) and then Ammp on trunk.  I'll get back to you.  It's looking promising though.
Comment 11 Ian Bolton 2011-01-12 10:36:39 UTC
I have now confirmed Richard's patch (http://gcc.gnu.org/ml/gcc-patches/2011-01/msg00548.html) is doing the right thing on 4.5 branch and trunk, for the tests that were previously failing.

Thanks very much for debugging this, Bernd.
Many thanks for your patch, Richard.
Comment 12 Ian Bolton 2011-01-12 10:44:11 UTC
(In reply to comment #9)
> Created attachment 22945 [details]
> Test patch
> 
> Could you try the following? It's a variant of a patch Richard Sandiford
> recently posted.

Just noticed you said this is a "variant" of Richard's patch.  Hopefully the two patches can be reconciled?
Comment 13 Eric Botcazou 2011-01-15 16:06:36 UTC
Please consider reverting the reload change on the branch.  I don't think we want to keep fiddling with reload there.
Comment 14 Bernd Schmidt 2011-01-19 19:31:20 UTC
(In reply to comment #13)
> Please consider reverting the reload change on the branch.  I don't think we
> want to keep fiddling with reload there.

I don't see how this makes any sense. We've identified two bugs in reload, both leading to incorrect code, and you'd rather revert the fix for one than fix both?
Comment 15 Eric Botcazou 2011-01-19 19:44:44 UTC
> I don't see how this makes any sense. We've identified two bugs in reload, both
> leading to incorrect code, and you'd rather revert the fix for one than fix
> both?

Yes, because the first one wasn't a regression and fixing the second may well cause a third to pop up, you never know with reload.  This is called RM...
Comment 16 Hans-Peter Nilsson 2011-01-19 20:30:14 UTC
(In reply to comment #15)
>  the first one wasn't a regression and fixing the second may well
> cause a third to pop up, you never know with reload.  This is called RM...

FWIW, you'd then be in danger of regressing 41085.
I have to say it won't be very RM:y to proactively unfix two bugs just because it's reload, given two separate fixes...
Comment 17 Eric Botcazou 2011-01-19 22:45:20 UTC
> I have to say it won't be very RM:y to proactively unfix two bugs just because
> it's reload, given two separate fixes...

Well, the second is still not fixed as of this writing, that's why I suggested reverting the problematic first fix in the first place.  As for "just because it's reload", yes, that's precisely the point; patching reload is always risky.
Comment 18 Richard Biener 2011-01-20 10:51:43 UTC
So am I right and this is a regression on the branch?  Please fill in
known-to-work and known-to-fail versions.  Thx.
Comment 19 Hans-Peter Nilsson 2011-01-20 17:31:34 UTC
Correct dependencies to include PR41085 (which _was_ a 4.5 regression, unlike PR45051 which was only a regression on 4.6).
Comment 20 Bernd Schmidt 2011-01-23 21:11:27 UTC
Author: bernds
Date: Sun Jan 23 21:11:24 2011
New Revision: 169144

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169144
Log:
	PR rtl-optimization/47166
	* reload1.c (emit_reload_insns): Disable the spill_reg_store
	mechanism for PRE_MODIFY and POST_MODIFY.
	(inc_for_reload): For PRE_MODIFY, return the insn that sets the
	reloadreg.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/reload1.c
Comment 21 Jakub Jelinek 2011-01-24 14:18:47 UTC
So is this now fixed on the trunk?  Can anyone run SPEC2k?
Comment 22 Ian Bolton 2011-01-24 14:54:51 UTC
(In reply to comment #21)
> So is this now fixed on the trunk?  Can anyone run SPEC2k?

I can run it.  I will report back when done.
Comment 23 Ian Bolton 2011-01-25 13:45:59 UTC
(In reply to comment #21)
> So is this now fixed on the trunk?  Can anyone run SPEC2k?

Spec2K's Ammp now runs correctly for trunk, with -mthumb -O3.

The rest of Spec2K is OK too (apart from mesa, which is the subject of another bug).
Comment 24 Jakub Jelinek 2011-01-25 15:20:52 UTC
Fixed on the trunk then.
Comment 25 Ramana Radhakrishnan 2011-02-01 01:16:08 UTC
(In reply to comment #15)
> > I don't see how this makes any sense. We've identified two bugs in reload, both
> > leading to incorrect code, and you'd rather revert the fix for one than fix
> > both?
> 
> Yes, because the first one wasn't a regression and fixing the second may well
> cause a third to pop up, you never know with reload.  This is called RM...

I am curious to know what we are doing with this ?. 

The patch for PR41085 caused this regression for ARM with the 4.5 branch and this fix is to fix the issue exposed by the fix for PR41085. Are we going to revert the fix for PR41085 or are we going to backport the current fix for this ? 

Thanks
Ramana
Comment 26 Bernd Schmidt 2011-02-01 01:34:59 UTC
I'll be testing the backport and committing the patch.
Comment 27 Ramana Radhakrishnan 2011-02-01 01:38:32 UTC
On Tue, Feb 1, 2011 at 1:35 AM, bernds at gcc dot gnu.org
<gcc-bugzilla@gcc.gnu.org> wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47166
>
> --- Comment #26 from Bernd Schmidt <bernds at gcc dot gnu.org> 2011-02-01 01:34:59 UTC ---
> I'll be testing the backport and committing the patch.

Thanks - I wasn't sure I knew what was happening.

Ramana

>
> --
> Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug.
>
Comment 28 Bernd Schmidt 2011-02-11 16:01:24 UTC
Author: bernds
Date: Fri Feb 11 16:01:19 2011
New Revision: 170053

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170053
Log:
	PR rtl-optimization/47166
	* reload1.c (emit_reload_insns): Disable the spill_reg_store
	mechanism for PRE_MODIFY and POST_MODIFY.
	(inc_for_reload): For PRE_MODIFY, return the insn that sets the
	reloadreg.

Modified:
    branches/gcc-4_5-branch/gcc/ChangeLog
    branches/gcc-4_5-branch/gcc/reload1.c
Comment 29 Bernd Schmidt 2011-02-11 16:02:23 UTC
I've had some problems with timeouts in my test setup, but I now have runs that differ only in that pthread1.cc times out for one multilib and no longer times out for another. I took that as good enough to commit the patch.
Comment 30 Ian Bolton 2011-02-15 15:47:24 UTC
(In reply to comment #29)
> I've had some problems with timeouts in my test setup, but I now have runs that
> differ only in that pthread1.cc times out for one multilib and no longer times
> out for another. I took that as good enough to commit the patch.

Just wanted to confirm that it is working for me with 4.5 now as well.  Thanks for all your efforts, Bernd.
Comment 31 Richard Sandiford 2011-03-14 13:46:20 UTC
Author: rsandifo
Date: Mon Mar 14 13:46:12 2011
New Revision: 170939

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170939
Log:
gcc/testsuite/
	PR rtl-optimization/47166
	* gcc.c-torture/execute/postmod-1.c: New test.

Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/postmod-1.c
Modified:
    trunk/gcc/testsuite/ChangeLog
Comment 32 Richard Sandiford 2011-03-14 13:48:48 UTC
Author: rsandifo
Date: Mon Mar 14 13:48:46 2011
New Revision: 170940

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170940
Log:
gcc/testsuite/
	PR rtl-optimization/47166
	* gcc.c-torture/execute/postmod-1.c: New test.

Added:
    branches/gcc-4_5-branch/gcc/testsuite/gcc.c-torture/execute/postmod-1.c
Modified:
    branches/gcc-4_5-branch/gcc/testsuite/ChangeLog
Comment 33 Richard Sandiford 2011-03-15 09:38:10 UTC
Author: rsandifo
Date: Tue Mar 15 09:38:07 2011
New Revision: 170982

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170982
Log:
gcc/testsuite/
	PR rtl-optimization/47166
	* gcc.c-torture/execute/postmod-1.c: New test.

Added:
    branches/gcc-4_6-branch/gcc/testsuite/gcc.c-torture/execute/postmod-1.c
Modified:
    branches/gcc-4_6-branch/gcc/testsuite/ChangeLog