[Bug target/70989] New: [SH] Further improve utilization of zero-displacement conditional branches

Sat May 7 02:55:00 GMT 2016

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70989

            Bug ID: 70989
           Summary: [SH] Further improve utilization of zero-displacement
                    conditional branches
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: olegendo at gcc dot gnu.org
  Target Milestone: ---

Created attachment 38432
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38432&action=edit
Simplify SH abs patterns

The change in r235993 disables the delay slot for conditional branches during
the DBR pass if the branch skips a single instruction.  This improves the
utilization of zero-displacement cbranches.
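
To make "zero-displacement" concrete, here is a small C sketch of the
displacement arithmetic.  It assumes the usual SH encoding of bt/bf, where the
target is the branch address + 4 + disp * 2 and every insn is 2 bytes long.
The function names are made up for illustration and are not part of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Displacement field of an SH bt/bf insn located at branch_addr that jumps
   to target_addr.  Assumption: target = branch_addr + 4 + disp * 2.  */
static int32_t cbranch_disp(uint32_t branch_addr, uint32_t target_addr)
{
    /* SH insns are 2 bytes long, so both addresses must be even.  */
    assert((branch_addr & 1) == 0 && (target_addr & 1) == 0);
    return ((int32_t)(target_addr - branch_addr) - 4) / 2;
}

/* Nonzero if the branch skips exactly the one insn that follows it,
   i.e. the displacement field is zero.  */
static int skips_single_insn(uint32_t branch_addr, uint32_t target_addr)
{
    return cbranch_disp(branch_addr, target_addr) == 0;
}
```

With a branch at address A, a target of A + 4 (one insn skipped) gives a
displacement of 0, while a target of A + 6 (two insns skipped) gives 1.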

There are still some cases where other basic block optimizations reorder
blocks in a way that makes zero-displacement cbranches impossible to use.

For example, when the abs patterns are simplified we get the following code in
CSiBE (cg_compiler_opensrc/compile.c).

Before:
.L872:
        cmp/pz  r6
        mov.l   .L1024,r13
        mov     r6,r1
        bt      0f
        neg     r6,r1
0:
        mov.l   r1,@(32,r15)

After:
.L872:
        cmp/pz  r6
        bt/s    .L1004
        mov.l   r6,@(32,r15)
        bra     .L1010
        neg     r6,r1
.L1004:
        mov.l   @(32,r15),r2

        ....
        < many blocks here >
        ...

.L1010:
        bra     .L1004
        mov.l   r1,@(32,r15)
        .align 1
.L1015:

With -fno-reorder-blocks it's a bit better (from the branching point of view):

.L818:
        cmp/pz  r6
        bt/s    .L950
        mov.l   r6,@(32,r15)
        neg     r6,r1
        mov.l   r1,@(32,r15)
.L950:
        mov.l   @(32,r15),r2

And with -fno-reorder-blocks -fno-delayed-branch it's clear that recovering
the zero-displacement branch is almost impossible at this stage:

.L794:
        cmp/pz  r6
        mov.l   r6,@(32,r15)
        bt      .L926
        neg     r6,r1
        mov.l   r1,@(32,r15)
.L926:
        mov.l   @(32,r15),r2
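
All of the sequences above compute the absolute value of r6 and store it in
the stack slot at @(32,r15).  At the source level the shape is roughly the
following.  This is a hypothetical sketch, not a reduced test case from
compile.c, and whether it reproduces the code above depends on the compiler
version and options.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical source shape: copy the value, then negate it only if it is
   negative.  The "if" body is a single negation, which is the one insn that
   a zero-displacement cbranch would skip.  */
static int abs_branchy(int x)
{
    assert(x != INT_MIN);   /* -INT_MIN overflows, so it is excluded here */
    int r = x;
    if (x < 0)
        r = -x;
    return r;
}
```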

Disallowing transformation of 1-insn cbranches in the bb-reorder pass might
lead to some improvements, but I guess that these 1-insn branches have to be
"pinned" much earlier during compilation to get better results.