Add a new combine pass

Segher Boessenkool segher@kernel.crashing.org
Sat Nov 23 23:01:00 GMT 2019


Hi!

On Mon, Nov 18, 2019 at 05:55:13PM +0000, Richard Sandiford wrote:
> Richard Sandiford <richard.sandiford@arm.com> writes:
> > (It's 23:35 local time, so it's still just about stage 1. :-))
> 
> Or actually, just under 1 day after end of stage 1.  Oops.
> Could have sworn stage 1 ended on the 17th :-(  Only realised
> I'd got it wrong when catching up on Saturday's email traffic.
> 
> And inevitably, I introduced a couple of stupid mistakes while
> trying to clean the patch up for submission by that (non-)deadline.
> Here's a version that fixes an inverted overlapping memref check
> and that correctly prunes the use list for combined instructions.
> (This last one is just a compile-time saving -- the old code was
> correct, just suboptimal.)

I've build the Linux kernel with the previous version, as well as this
one.  R0 is unmodified GCC, R1 is the first patch, R2 is this one:

(I've forced --param=run-combine=6 for R1 and R2):
(Percentages are relative to R0):

                    R0        R1        R2        R1        R2
       alpha   6107088   6101088   6101088   99.902%   99.902%
         arc   4008224   4006568   4006568   99.959%   99.959%
         arm   9206728   9200936   9201000   99.937%   99.938%
       arm64  13056174  13018174  13018194   99.709%   99.709%
       armhf         0         0         0         0         0
         c6x   2337237   2337077   2337077   99.993%   99.993%
        csky   3356602         0         0         0         0
       h8300   1166996   1166776   1166776   99.981%   99.981%
        i386  11352159         0         0         0         0
        ia64  18230640  18167000  18167000   99.651%   99.651%
        m68k   3714271         0         0         0         0
  microblaze   4982749   4979945   4979945   99.944%   99.944%
        mips   8499309   8495205   8495205   99.952%   99.952%
      mips64   7042036   7039816   7039816   99.968%   99.968%
       nds32   4486663         0         0         0         0
       nios2   3680001   3679417   3679417   99.984%   99.984%
    openrisc   4226076   4225868   4225868   99.995%   99.995%
      parisc   7681895   7680063   7680063   99.976%   99.976%
    parisc64   8677077   8676581   8676581   99.994%   99.994%
     powerpc  10687611  10682199  10682199   99.949%   99.949%
   powerpc64  17671082  17658570  17658570   99.929%   99.929%
 powerpc64le  17671082  17658570  17658570   99.929%   99.929%
     riscv32   1554938   1554758   1554758   99.988%   99.988%
     riscv64   6634342   6632788   6632788   99.977%   99.977%
        s390  13049643  13014939  13014939   99.734%   99.734%
          sh   3254743         0         0         0         0
     shnommu   1632364   1632124   1632124   99.985%   99.985%
       sparc   4404993   4399593   4399593   99.877%   99.877%
     sparc64   6796711   6797491   6797491  100.011%  100.011%
      x86_64  19713174  19712817  19712817   99.998%   99.998%
      xtensa         0         0         0         0         0

0 means it didn't build.

armhf is probably my own problem, not sure yet.

xtensa starts with
/tmp/ccmJoY7l.s: Assembler messages:
/tmp/ccmJoY7l.s:407: Error: cannot represent `BFD_RELOC_8' relocation in object file
and it doesn't get better.

My powerpc64 config actually built the powerpc64le config, since the
kernel since a while looks what the host system is, for its defconfig.
Oh well, fixed now.

There are fivew new failures, with either of the combine2 patches.  And
all five are actually different (different symptoms, at least):

- csky fails on libgcc build:

/home/segher/src/gcc/libgcc/fp-bit.c: In function '__fixdfsi':
/home/segher/src/gcc/libgcc/fp-bit.c:1405:1: error: unable to generate reloads for:
 1405 | }
      | ^
(insn 199 86 87 8 (parallel [
            (set (reg:SI 101)
                (plus:SI (reg:SI 98)
                    (const_int -32 [0xffffffffffffffe0])))
            (set (reg:CC 33 c)
                (lt:CC (plus:SI (reg:SI 98)
                        (const_int -32 [0xffffffffffffffe0]))
                    (const_int 0 [0])))
        ]) "/home/segher/src/gcc/libgcc/fp-bit.c":1403:23 207 {*cskyv2_declt}
     (nil))
during RTL pass: reload

Target problem?

- i386 goes into an infinite loop compiling, or at least an hour or so...
Erm I forgot too record what it was compiling.  I did attach a GDB...  It
is something from lra_create_live_ranges.

- m68k:

/home/segher/src/kernel/fs/exec.c: In function 'copy_strings':
/home/segher/src/kernel/fs/exec.c:590:1: internal compiler error: in final_scan_insn_1, at final.c:3048
  590 | }
      | ^
0x10408307 final_scan_insn_1
        /home/segher/src/gcc/gcc/final.c:3048
0x10408383 final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
        /home/segher/src/gcc/gcc/final.c:3152
0x10408797 final_1
        /home/segher/src/gcc/gcc/final.c:2020
0x104091f7 rest_of_handle_final
        /home/segher/src/gcc/gcc/final.c:4658
0x104091f7 execute
        /home/segher/src/gcc/gcc/final.c:4736

and that line is
            gcc_assert (prev_nonnote_insn (insn) == last_ignored_compare);

- nds32:

/tmp/ccC8Czca.s: Assembler messages:
/tmp/ccC8Czca.s:3144: Error: Unrecognized operand/register, lmw.bi [$fp+(-60)],[$fp],$r11,0x0.

/tmp/ccl8o20c.s: Assembler messages:
/tmp/ccl8o20c.s:2449: Error: Unrecognized operand/register, lmw.bi $r9,[$fp],[$fp+(-132)],0x0.

/tmp/ccZxjwHd.s: Assembler messages:
/tmp/ccZxjwHd.s:4776: Error: Unrecognized operand/register, lmw.bi [$fp+(-52)],[$fp],[$fp+(-56)],0x0.

/tmp/cczjOS3d.s: Assembler messages:
/tmp/cczjOS3d.s:2336: Error: Unrecognized operand/register, lmw.bi $r16,[$fp],$r7,0x0.

and more.  All lmw.bi...  target issue?

- sh (that's sh4-linux):

/home/segher/src/kernel/net/ipv4/af_inet.c: In function 'snmp_get_cpu_field':
/home/segher/src/kernel/net/ipv4/af_inet.c:1638:1: error: unable to find a register to spill in class 'R0_REGS'
 1638 | }
      | ^
/home/segher/src/kernel/net/ipv4/af_inet.c:1638:1: error: this is the insn:
(insn 18 17 19 2 (set (reg:SI 0 r0)
        (mem:SI (plus:SI (reg:SI 4 r4 [178])
                (reg:SI 6 r6 [171])) [17 *_3+0 S4 A32])) "/home/segher/src/kernel/net/ipv4/af_inet.c":1638:1 188 {movsi_i}
     (expr_list:REG_DEAD (reg:SI 4 r4 [178])
        (expr_list:REG_DEAD (reg:SI 6 r6 [171])
            (nil))))
/home/segher/src/kernel/net/ipv4/af_inet.c:1638: confused by earlier errors, bailing out


Looking at just binary size, which is a good stand-in for how many insns
it combined:

                    R2
       arm64   99.709%
        ia64   99.651%
        s390   99.734%
       sparc   99.877%
     sparc64  100.011%

(These are those that are not between 99.9% and 100.0%).

So only sparc64 regressed, and just a tiny bit (I can look at what that
is, if there is interest).  But 32-bit sparc improved, and s390, arm64,
and ia64 got actual benefit.

Again this is just code size, not analysing the actually changed code.


I did look at the powerpc64le changes.  It is almost completely load-
with-update (and store-with-update) insns that make the difference, but
there are also some dot insns.  The extra mr. are usually not a good
idea, but the extsw. are.  Sometimes this causes *more* insns in the end
(register move insns), but that is the exception.

This mr. problem is there with combine already, btw.  In the end it is
caused by this just not being something good to do on pseudos, it would
be better to do this after RA, in a peephole or similar.  OTOH it isn't
actually really important for performance either way.

Btw, does the new pass use TARGET_LEGITIMATE_COMBINED_INSN?  It probably
should.  (That would be the hook where we would probably want to prevent
generating mr. insns).


Segher



More information about the Gcc-patches mailing list