Bug 54369 - delayed-branch pass removes too many instructions
Summary: delayed-branch pass removes too many instructions
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: unknown
: P3 normal
Target Milestone: 4.6.4
Assignee: Eric Botcazou
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-08-24 18:17 UTC by chaoyingfu@gcc.gnu.org
Modified: 2014-02-16 10:00 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2012-08-29 00:00:00


Attachments
A simple test for an issue in reorg.c (809 bytes, text/plain)
2012-08-24 18:17 UTC, chaoyingfu@gcc.gnu.org
Details

Note You need to log in before you can comment on or make changes to this bug.
Description chaoyingfu@gcc.gnu.org 2012-08-24 18:17:26 UTC
Created attachment 28079 [details]
A simple test for an issue in reorg.c

Please check this example when using GCC 4.6 for mips-linux-gnu.
(I think the issue is in GCC 4.7 and later as well.  But due to the basic block ordering, I cannot reproduce this issue.)
Ex:
# /gcc46branch/build-mips/gcc/cc1plus -quiet Bug.i -o Bug.s -mips32r2 -O2
# cat Bug.s
_Z3bugP9ValueRTRK:
        .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, gp= 0
        .mask   0x00000000,0
        .fmask  0x00000000,0
        .set    noreorder
        .set    nomacro
        lw      $2,0($4) <---------
        lw      $2,0($2) <--------- (NO RETURN IN THIS FUNCTION.)
        .set    macro
        .set    reorder
        .end    _Z3bugP9ValueRTRK
        .cfi_endproc
$LFE13:
        .size   _Z3bugP9ValueRTRK, .-_Z3bugP9ValueRTRK
        .ident  "GCC: (GNU) 4.6.4 20120823 (prerelease) [gcc-4_6-branch revision
 190613]"

  The delayed-branch pass in reorg.c is too aggressive, such that a return instruction and others are removed.  From debugging code in reorg.c: delete_jump() -> delete_computation() -> delete_related_insn(), I think we should not use delete_related_insn() that removes branches and all the following instructions.  A simple fix may be just using delete_insn() in delete_computation().
Ex: # svn diff reorg.c
Index: reorg.c
===================================================================
--- reorg.c     (revision 190636)
+++ reorg.c     (working copy)
@@ -3298,7 +3298,7 @@
       delete_prior_computation (note, insn);
     }

-  delete_related_insns (insn);
+  delete_insn (insn);
 }

 /* If all INSN does is set the pc, delete it,

Ex: (Bug.s after this fix.)
# cat Bug.s
_Z3bugP9ValueRTRK:
        .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, gp= 0
        .mask   0x00000000,0
        .fmask  0x00000000,0
        .set    noreorder
        .set    nomacro
        lw      $2,0($4)
        lw      $2,0($2)
$L4:
        j       $31      <------------- OK
        srl     $2,$2,8  <------------- OK

        .set    macro
        .set    reorder
        .end    _Z3bugP9ValueRTRK
        .cfi_endproc
$LFE13:
        .size   _Z3bugP9ValueRTRK, .-_Z3bugP9ValueRTRK
        .ident  "GCC: (GNU) 4.6.4 20120823 (prerelease) [gcc-4_6-branch revision
 190613]"

Thanks!
Comment 1 Eric Botcazou 2012-08-29 22:42:54 UTC
delete_related_insns isn't supposed to do incorrect things though, so the issue must really be analyzed first.
Comment 2 Steve Ellcey 2012-08-31 00:30:01 UTC
I think I see where delete_related_insns is going wrong.  We call it
with a JUMP instruction that we want to remove because we are just
jumping to the next instruction (a label) which we would get to anyway
without the jump, then after delete_related_insns removes the jump it
checks the label it was jumping to, and if it finds no uses it calls
delete_related_insns with that label.  When delete_related_insns
is called with a label, it assumes it can remove all the code after
that label as unreachable.  But the code after the label can be reached
by the default fall-through code sequence.  So after removing a jump and
finding a label with no uses we should call delete_insn with that label,
not delete_related_insns.

So my proposed fix would be:

diff --git a/gcc/jump.c b/gcc/jump.c
index 9721fe1..fa3d65a 100644
--- a/gcc/jump.c
+++ b/gcc/jump.c
@@ -1207,7 +1207,7 @@ delete_related_insns (rtx insn)
        /* This can delete NEXT or PREV,
           either directly if NEXT is JUMP_LABEL (INSN),
           or indirectly through more levels of jumps.  */
-       delete_related_insns (lab);
+       delete_insn (lab);
       else if (tablejump_p (insn, NULL, &lab_next))
        {
          /* If we're deleting the tablejump, delete the dispatch table.
Comment 3 chaoyingfu@gcc.gnu.org 2012-08-31 00:38:39 UTC
The fall-through path of this jump has a barrier (resulted from __builtin_unreachable), and the target of this jump has a return instruction.

The algorithm in reorg.c may be more conservative to keep this jump for code correctness.  Just another solution to this issue.  Thanks!
Comment 4 Eric Botcazou 2012-08-31 14:20:03 UTC
> I think I see where delete_related_insns is going wrong.  We call it
> with a JUMP instruction that we want to remove because we are just
> jumping to the next instruction (a label) which we would get to anyway
> without the jump, then after delete_related_insns removes the jump it
> checks the label it was jumping to, and if it finds no uses it calls
> delete_related_insns with that label.  When delete_related_insns
> is called with a label, it assumes it can remove all the code after
> that label as unreachable.

Only if there is a barrier just before the label though; now this barrier should have been removed one level earlier.

The root cause of the problem is that MIPS runs its own version of the dbr pass and doesn't make sure that barriers are correctly placed for it, unlike other targets like SPARC.  So the fix is:

Index: config/mips/mips.c
===================================================================
--- config/mips/mips.c  (revision 184886)
+++ config/mips/mips.c  (working copy)
@@ -15083,7 +15083,10 @@ mips_reorg (void)
     }
 
   if (optimize > 0 && flag_delayed_branch)
-    dbr_schedule (get_insns ());
+    {
+      cleanup_barriers ();
+      dbr_schedule (get_insns ());
+    }
   mips_reorg_process_insns ();
   if (!TARGET_MIPS16
       && TARGET_EXPLICIT_RELOCS
Comment 5 Steve Ellcey 2012-08-31 18:16:08 UTC
Thanks for digging into this Eric.  I tested your patch here against the example and on the GCC testsuite and didn't see any problems.  Are you going to check this in on the mainline?
Comment 6 Eric Botcazou 2012-08-31 18:55:11 UTC
> Thanks for digging into this Eric.  I tested your patch here against the
> example and on the GCC testsuite and didn't see any problems.

You're welcome.  Thanks for the testing.

> Are you going to check this in on the mainline?

Not without Richard's approval, so I've CCed him now.
Comment 7 rdsandiford@googlemail.com 2012-09-01 08:16:23 UTC
"ebotcazou at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> writes:
>> Are you going to check this in on the mainline?
>
> Not without Richard's approval, so I've CCed him now.

Looks good, thanks.  Please go ahead.

Just out of interest, you said:

> The root cause of the problem is that MIPS runs its own version of the dbr pass
> and doesn't make sure that barriers are correctly placed for it, unlike other
> targets like SPARC.

But how does SPARC do it?  sparc_reorg starts:

  /* The only erratum we handle for now is that of the AT697F processor.  */
  if (!sparc_fix_at697f)
    return;

  /* We need to have the (essentially) final form of the insn stream in order
     to properly detect the various hazards.  Run delay slot scheduling.  */
  if (optimize > 0 && flag_delayed_branch)
    dbr_schedule (get_insns ());

without any explicit call to cleanup_barriers (which like you say usually
runs after machine_reorg but before delay_slots).
Comment 8 Eric Botcazou 2012-09-01 09:21:19 UTC
> Looks good, thanks.  Please go ahead.

Thanks.  On which branch(es)?

> But how does SPARC do it?  sparc_reorg starts:
> 
>   /* The only erratum we handle for now is that of the AT697F processor.  */
>   if (!sparc_fix_at697f)
>     return;
> 
>   /* We need to have the (essentially) final form of the insn stream in order
>      to properly detect the various hazards.  Run delay slot scheduling.  */
>   if (optimize > 0 && flag_delayed_branch)
>     dbr_schedule (get_insns ());
> 
> without any explicit call to cleanup_barriers (which like you say usually
> runs after machine_reorg but before delay_slots).

Gosh.  I didn't remember that sparc_reorg existed...  I was referring to the regular pass pipeline, which is almost always used for SPARC.  So I found the bug in the MIPS port and you found it in the SPARC port.  Time to swap? ;-)
Comment 9 rdsandiford@googlemail.com 2012-09-01 09:41:08 UTC
"ebotcazou at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> writes:
>> Looks good, thanks.  Please go ahead.
>
> Thanks.  On which branch(es)?

Hmm, looks like this goes all the way back to 4.5, so as any as
you've got the stamina for.  It should be a low-risk change.

Richard
Comment 10 Eric Botcazou 2012-09-02 10:36:31 UTC
Author: ebotcazou
Date: Sun Sep  2 10:36:27 2012
New Revision: 190858

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190858
Log:
	PR rtl-optimization/54369
	* config/mips/mips.c (mips_reorg): Invoke cleanup_barriers before
	calling dbr_schedule.
	* config/sparc/sparc.c (sparc_reorg): Likewise.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/mips/mips.c
    trunk/gcc/config/sparc/sparc.c
Comment 11 Eric Botcazou 2012-09-02 10:37:00 UTC
Author: ebotcazou
Date: Sun Sep  2 10:36:54 2012
New Revision: 190859

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190859
Log:
	PR rtl-optimization/54369
	* config/mips/mips.c (mips_reorg): Invoke cleanup_barriers before
	calling dbr_schedule.
	* config/sparc/sparc.c (sparc_reorg): Likewise.

Modified:
    branches/gcc-4_7-branch/gcc/ChangeLog
    branches/gcc-4_7-branch/gcc/config/mips/mips.c
    branches/gcc-4_7-branch/gcc/config/sparc/sparc.c
Comment 12 Eric Botcazou 2012-09-02 10:37:52 UTC
Author: ebotcazou
Date: Sun Sep  2 10:37:49 2012
New Revision: 190860

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190860
Log:
	PR rtl-optimization/54369
	* config/mips/mips.c (mips_reorg): Invoke cleanup_barriers before
	calling dbr_schedule.
	* config/sparc/sparc.c (sparc_reorg): Likewise.

Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/config/mips/mips.c
    branches/gcc-4_6-branch/gcc/config/sparc/sparc.c
Comment 13 Eric Botcazou 2012-09-02 10:47:46 UTC
On all active branches.
Comment 14 Jackie Rosen 2014-02-16 10:00:28 UTC Comment hidden (spam)