As of r162558 on trunk, and r164498 on 4.5 branch, SpecCPU2000 Ammp has failed for ARM -O3 thumb. The same patch was committed for both of these revisions, intending to fix pr45051: Index: gcc/reload1.c =================================================================== --- gcc/reload1.c (revision 164495) +++ gcc/reload1.c (revision 164498) @@ -8325,6 +8325,8 @@ delete_output_reload (rtx insn, int j, i int n_inherited = 0; rtx i1; rtx substed; + unsigned regno; + int nregs; /* It is possible that this reload has been only used to set another reload we eliminated earlier and thus deleted this instruction too. */ @@ -8376,6 +8378,12 @@ delete_output_reload (rtx insn, int j, i if (n_occurrences > n_inherited) return; + regno = REGNO (reg); + if (regno >= FIRST_PSEUDO_REGISTER) + nregs = 1; + else + nregs = hard_regno_nregs[regno][GET_MODE (reg)]; + /* If the pseudo-reg we are reloading is no longer referenced anywhere between the store into it and here, and we're within the same basic block, then the value can only @@ -8387,7 +8395,7 @@ delete_output_reload (rtx insn, int j, i if (NOTE_INSN_BASIC_BLOCK_P (i1)) return; if ((NONJUMP_INSN_P (i1) || CALL_P (i1)) - && reg_mentioned_p (reg, PATTERN (i1))) + && refers_to_regno_p (regno, regno + nregs, PATTERN (i1), NULL)) { /* If this is USE in front of INSN, we only have to check that there are no more references than accounted for by inheritance. */ I assume the patch was meant to prevent deletions that shouldn't occur. This might be what happens for the original symptomatic test, but I am now seeing extra deletions that shouldn't happen for Ammp. For example, without this patch, you get these insns somewhere in the ira dump for mm_fv_update_nonbon() from rectmm.c: (insn 3163 3161 3164 107 rectmm.c:1041 (set (reg:SI 1 r1) (plus:SI (reg:SI 1 r1) (const_int 280 [0x118]))) 4 {*arm_addsi3} (nil)) (insn 3164 3163 1730 107 rectmm.c:1041 (set (reg:SI 3 r3) (reg:SI 1 r1)) 586 {*thumb2_movsi_vfp} (nil)) With the patch, you lose the add and just get this: (insn 3164 3161 1730 107 rectmm.c:1041 (set (reg:SI 3 r3) (reg:SI 1 r1)) 586 {*thumb2_movsi_vfp} (nil)) The incrementing of r1 is perfectly legitimate and useful and removing it is a bug. Other increments of r9, ip, r0 and r3 are also lost. I think the issue might be that reg_mentioned_p() considers output registers to have been "mentioned", whereas the refers_to_regno_p() does not consider an output register to have been "referred to". I can see the problem with only using reg_mentioned_p() because it doesn't handle subregs, but there is also a problem with only using refers_to_regno_p(), as we can see with this segfault in Ammp. I therefore wonder if the fix might be this: Index: gcc/reload1.c =================================================================== --- gcc/reload1.c (revision 168082) +++ gcc/reload1.c (working copy) @@ -8395,7 +8395,8 @@ delete_output_reload (rtx insn, int j, i if (NOTE_INSN_BASIC_BLOCK_P (i1)) return; if ((NONJUMP_INSN_P (i1) || CALL_P (i1)) - && refers_to_regno_p (regno, regno + nregs, PATTERN (i1), NULL)) + && (refers_to_regno_p (regno, regno + nregs, PATTERN (i1), NULL) + || reg_mentioned_p (reg, PATTERN (i1)))) { /* If this is USE in front of INSN, we only have to check that there are no more references than accounted for by inheritance. */ It would be helpful to have some input from someone who understands reload better than I do, so we can come up with a fix that we have confidence in.
Please post a preprocessed source file and the exact compiler options that must be passed to cc1 to produce incorrect assembly. If possible, also include the two dump files from before/after things go wrong.
Created attachment 22896 [details] Preprocessed source, before/after .s file, before/after IRA dump Contains the following: pr47166/rectmm.i - the preprocessed source pr47166/CMD - how to compile it pr47166/rectmm.r164495.i.188r.ira - working IRA dump pr47166/rectmm.r164495.s - working assembly pr47166/rectmm.r164498.i.188r.ira - broken IRA dump pr47166/rectmm.r164498.s - broken assembly Note that the revision numbers represent working/broken revisions of gcc 4.5 branch.
> > pr47166/CMD - how to compile it > Sorry, I forgot to give cc1 commands: cc1 -fpreprocessed rectmm.i -quiet -dumpbase rectmm.i -mthumb -mcpu=cortex-a9 -mfloat-abi=softfp -mfpu=vfpv3-d16 -auxbase rectmm -O3 -version -fno-tree-vectorize -fdump-rtl-ira-details -o rectmm.s
(In reply to comment #0) > (insn 3163 3161 3164 107 rectmm.c:1041 (set (reg:SI 1 r1) > (plus:SI (reg:SI 1 r1) > (const_int 280 [0x118]))) 4 {*arm_addsi3} (nil)) Please consider my request in <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45051#c8>.
(In reply to comment #4) > (In reply to comment #0) > > (insn 3163 3161 3164 107 rectmm.c:1041 (set (reg:SI 1 r1) > > (plus:SI (reg:SI 1 r1) > > (const_int 280 [0x118]))) 4 {*arm_addsi3} (nil)) > > Please consider my request in > <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45051#c8>. I mean, the <But r1 is an input as well as an output , i.e. "referred to", so insn 3163 should have matched without your patch. I'm missing analysis on why that didn't happen> part.
> I mean, the <But r1 is an input as well as an output , i.e. "referred to", so > insn 3163 should have matched without your patch. I'm missing analysis on why > that didn't happen> part. OK, I will do more analysis to try to determine what's going on.
I have altered the source, so that I call both refers_to_regno_p and reg_mentioned_p and print out when the two disagree. For the attached rectmm.i input file, there are only 5 disagreements: mismatch for regno=2343: refersto=F,mentions=T reg = (reg/f:SI 2343) i1 = (insn 3162 1730 3165 99 rectmm.c:1041 (set (reg/f:SI 2343) (reg:SI 1 r1)) -1 (nil)) mismatch for regno=2355: refersto=F,mentions=T reg = (reg/f:SI 2355) i1 = (insn 3166 1745 3169 99 rectmm.c:1043 (set (reg/f:SI 2355) (reg:SI 9 r9)) -1 (nil)) mismatch for regno=2377: refersto=F,mentions=T reg = (reg/f:SI 2377) i1 = (insn 3170 1770 1731 99 rectmm.c:1045 (set (reg/f:SI 2377) (reg:SI 12 ip)) -1 (nil)) mismatch for regno=2349: refersto=F,mentions=T reg = (reg/f:SI 2349) i1 = (insn 3174 1737 3177 99 rectmm.c:1042 (set (reg/f:SI 2349) (reg:SI 0 r0)) -1 (nil)) mismatch for regno=2361: refersto=F,mentions=T - reg and i1 follow ... reg = (reg/f:SI 2361) il = (insn 3178 1752 1772 99 rectmm.c:1044 (set (reg/f:SI 2361) (reg:SI 3 r3)) -1 (nil)) This occurs because reg_mentioned_p looks at output regs, but refers_to_regno_p does not. The knock-on effect of this difference must lead to those increments being lost, but I don't know why yet.
I think the real problem here is that when reloading autoincs, we somehow end up with (gdb) p spill_reg_store[3] $42 = (rtx) 0xf7a1118c (gdb) pr (insn 3163 3161 3164 99 rectmm.c:1041 (set (reg:SI 1 r1) (plus:SI (reg:SI 1 r1) (const_int 280 [0x118]))) 4 {*arm_addsi3} (nil)) which doesn't set R3 at all - insn 3164 does.
Created attachment 22945 [details] Test patch Could you try the following? It's a variant of a patch Richard Sandiford recently posted.
(In reply to comment #9) > Created attachment 22945 [details] > Test patch > > Could you try the following? It's a variant of a patch Richard Sandiford > recently posted. Thanks for looking into this. The patch has worked on gcc 4.5 for Ammp, but I still need to test Applu (which is failing for current version of gcc 4.5 branch) and then Ammp on trunk. I'll get back to you. It's looking promising though.
I have now confirmed Richard's patch (http://gcc.gnu.org/ml/gcc-patches/2011-01/msg00548.html) is doing the right thing on 4.5 branch and trunk, for the tests that were previously failing. Thanks very much for debugging this, Bernd. Many thanks for your patch, Richard.
(In reply to comment #9) > Created attachment 22945 [details] > Test patch > > Could you try the following? It's a variant of a patch Richard Sandiford > recently posted. Just noticed you said this is a "variant" of Richard's patch. Hopefully the two patches can be reconciled?
Please consider reverting the reload change on the branch. I don't think we want to keep fiddling with reload there.
(In reply to comment #13) > Please consider reverting the reload change on the branch. I don't think we > want to keep fiddling with reload there. I don't see how this makes any sense. We've identified two bugs in reload, both leading to incorrect code, and you'd rather revert the fix for one than fix both?
> I don't see how this makes any sense. We've identified two bugs in reload, both > leading to incorrect code, and you'd rather revert the fix for one than fix > both? Yes, because the first one wasn't a regression and fixing the second may well cause a third to pop up, you never know with reload. This is called RM...
(In reply to comment #15) > the first one wasn't a regression and fixing the second may well > cause a third to pop up, you never know with reload. This is called RM... FWIW, you'd then be in danger of regressing 41085. I have to say it won't be very RM:y to proactively unfix two bugs just because it's reload, given two separate fixes...
> I have to say it won't be very RM:y to proactively unfix two bugs just because > it's reload, given two separate fixes... Well, the second is still not fixed as of this writing, that's why I suggested reverting the problematic first fix in the first place. As for "just because it's reload", yes, that's precisely the point; patching reload is always risky.
So am I right and this is a regression on the branch? Please fill in known-to-work and known-to-fail versions. Thx.
Correct dependencies to include PR41085 (which _was_ a 4.5 regression, unlike PR45051 which was only a regression on 4.6).
Author: bernds Date: Sun Jan 23 21:11:24 2011 New Revision: 169144 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169144 Log: PR rtl-optimization/47166 * reload1.c (emit_reload_insns): Disable the spill_reg_store mechanism for PRE_MODIFY and POST_MODIFY. (inc_for_reload): For PRE_MODIFY, return the insn that sets the reloadreg. Modified: trunk/gcc/ChangeLog trunk/gcc/reload1.c
So is this now fixed on the trunk? Can anyone run SPEC2k?
(In reply to comment #21) > So is this now fixed on the trunk? Can anyone run SPEC2k? I can run it. I will report back when done.
(In reply to comment #21) > So is this now fixed on the trunk? Can anyone run SPEC2k? Spec2K's Ammp now runs correctly for trunk, with -mthumb -O3. The rest of Spec2K is OK too (apart from mesa, which is the subject of another bug).
Fixed on the trunk then.
(In reply to comment #15) > > I don't see how this makes any sense. We've identified two bugs in reload, both > > leading to incorrect code, and you'd rather revert the fix for one than fix > > both? > > Yes, because the first one wasn't a regression and fixing the second may well > cause a third to pop up, you never know with reload. This is called RM... I am curious to know what we are doing with this ?. The patch for PR41085 caused this regression for ARM with the 4.5 branch and this fix is to fix the issue exposed by the fix for PR41085. Are we going to revert the fix for PR41085 or are we going to backport the current fix for this ? Thanks Ramana
I'll be testing the backport and committing the patch.
On Tue, Feb 1, 2011 at 1:35 AM, bernds at gcc dot gnu.org <gcc-bugzilla@gcc.gnu.org> wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47166 > > --- Comment #26 from Bernd Schmidt <bernds at gcc dot gnu.org> 2011-02-01 01:34:59 UTC --- > I'll be testing the backport and committing the patch. Thanks - I wasn't sure I knew what was happening. Ramana > > -- > Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email > ------- You are receiving this mail because: ------- > You are on the CC list for the bug. >
Author: bernds Date: Fri Feb 11 16:01:19 2011 New Revision: 170053 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170053 Log: PR rtl-optimization/47166 * reload1.c (emit_reload_insns): Disable the spill_reg_store mechanism for PRE_MODIFY and POST_MODIFY. (inc_for_reload): For PRE_MODIFY, return the insn that sets the reloadreg. Modified: branches/gcc-4_5-branch/gcc/ChangeLog branches/gcc-4_5-branch/gcc/reload1.c
I've had some problems with timeouts in my test setup, but I now have runs that differ only in that pthread1.cc times out for one multilib and no longer times out for another. I took that as good enough to commit the patch.
(In reply to comment #29) > I've had some problems with timeouts in my test setup, but I now have runs that > differ only in that pthread1.cc times out for one multilib and no longer times > out for another. I took that as good enough to commit the patch. Just wanted to confirm that it is working for me with 4.5 now as well. Thanks for all your efforts, Bernd.
Author: rsandifo Date: Mon Mar 14 13:46:12 2011 New Revision: 170939 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170939 Log: gcc/testsuite/ PR rtl-optimization/47166 * gcc.c-torture/execute/postmod-1.c: New test. Added: trunk/gcc/testsuite/gcc.c-torture/execute/postmod-1.c Modified: trunk/gcc/testsuite/ChangeLog
Author: rsandifo Date: Mon Mar 14 13:48:46 2011 New Revision: 170940 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170940 Log: gcc/testsuite/ PR rtl-optimization/47166 * gcc.c-torture/execute/postmod-1.c: New test. Added: branches/gcc-4_5-branch/gcc/testsuite/gcc.c-torture/execute/postmod-1.c Modified: branches/gcc-4_5-branch/gcc/testsuite/ChangeLog
Author: rsandifo Date: Tue Mar 15 09:38:07 2011 New Revision: 170982 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170982 Log: gcc/testsuite/ PR rtl-optimization/47166 * gcc.c-torture/execute/postmod-1.c: New test. Added: branches/gcc-4_6-branch/gcc/testsuite/gcc.c-torture/execute/postmod-1.c Modified: branches/gcc-4_6-branch/gcc/testsuite/ChangeLog