[PATCH 0/5] Tweak IRA handling of tying and earlyclobbers

Richard Sandiford richard.sandiford@arm.com
Fri Jun 21 17:43:00 GMT 2019


Richard Sandiford <richard.sandiford@arm.com> writes:
> This series of patches tweaks the IRA handling of matched constraints
> and earlyclobbers.  The main explanations are in the individual patches.
>
> Tested on aarch64-linux-gnu (with and without SVE) and x86_64-linux-gnu.
>
> I also tried building at least one target per CPU directory and
> comparing the effect of the patches on the assembly output for
> gcc.c-torture, gcc.dg and g++.dg using -O2 -ftree-vectorize.  The table
> below summarises the effect on the number of lines of assembly, ignoring
> tests for which the number of lines was the same:

Forgot to say that this list excludes targets for which there were
no changes in assembly length.  (Thought I'd better say that since
the list clearly doesn't have one entry per CPU directory.)

FWIW the full list was:

    aarch64-linux-gnu aarch64_be-linux-gnu alpha-linux-gnu amdgcn-amdhsa
    arc-elf arm-linux-gnueabi arm-linux-gnueabihf avr-elf bfin-elf
    c6x-elf cr16-elf cris-elf csky-elf epiphany-elf fr30-elf
    frv-linux-gnu ft32-elf h8300-elf hppa64-hp-hpux11.23 ia64-linux-gnu
    i686-pc-linux-gnu i686-apple-darwin iq2000-elf lm32-elf m32c-elf
    m32r-elf m68k-linux-gnu mcore-elf microblaze-elf mipsel-linux-gnu
    mipsisa64-linux-gnu mmix mn10300-elf moxie-rtems msp430-elf
    nds32le-elf nios2-linux-gnu nvptx-none or1k-elf pdp11
    powerpc64-linux-gnu powerpc64le-linux-gnu powerpc-ibm-aix7.0 pru-elf
    riscv32-elf riscv64-elf rl78-elf rx-elf s390-linux-gnu
    s390x-linux-gnu sh-linux-gnu sparc-linux-gnu sparc64-linux-gnu
    sparc-wrs-vxworks spu-elf tilegx-elf tilepro-elf xstormy16-elf
    v850-elf vax-netbsdelf visium-elf x86_64-darwin x86_64-linux-gnu
    xtensa-elf

> Target                 Tests  Delta   Best  Worst Median
> ======                 =====  =====   ====  ===== ======
> alpha-linux-gnu           87   -126    -96    138     -1
> arm-linux-gnueabi         38    -37    -10      4     -1
> arm-linux-gnueabihf       38    -37    -10      4     -1
> avr-elf                   19    -64    -60     14     -1
> bfin-elf                 143    -55    -21     21     -1
> c6x-elf                   38    -32     -9     16     -1
> cris-elf                 253  -1456   -192     24     -1
> csky-elf                 101   -221    -36     26     -1
> frv-linux-gnu             11    -23     -8     -1     -1
> ft32-elf                   1     -2     -2     -2     -2
> hppa64-hp-hpux11.23       66    -24    -12     12     -1
> i686-apple-darwin         22    -45    -24     11     -1
> i686-pc-linux-gnu         18    -65    -96     40     -1
> ia64-linux-gnu             1     -4     -4     -4     -4
> m68k-linux-gnu            83     31    -70     18      1
> mcore-elf                 26   -122    -38     11     -2
> mmix                      29   -110    -25      3     -1
> mn10300-elf              399    258    -70     70      1
> msp430-elf               120   1363    -13    833      2
> pdp11                     37    -90    -92     25     -1
> powerpc-ibm-aix7.0        31    -25     -4      3     -1
> powerpc64-linux-gnu       31    -26     -2      2     -1
> powerpc64le-linux-gnu     31    -26     -2      2     -1
> pru-elf                    2      8      1      7      1
> riscv32-elf                1     -2     -2     -2     -2
> riscv64-elf                1     -2     -2     -2     -2
> rl78-elf                   6    -20    -18      9     -3
> rx-elf                   123     32    -58     30     -1
> s390-linux-gnu             7     16     -6      9      1
> s390x-linux-gnu            1     -3     -3     -3     -3
> sh-linux-gnu             475  -4696   -843     42     -1
> spu-elf                  168   -296   -114     25     -2
> visium-elf               214   -936   -183     22     -1
> x86_64-darwin             30    -25     -4      2     -1
> x86_64-linux-gnu          28    -29     -4      1     -1
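
For anyone wanting to reproduce the columns above, here is a minimal sketch
of how they could be computed from per-test line counts.  The function name
and the dict-based input are hypothetical, and the use of the lower median
is an assumption (it happens to agree with the pru-elf row, where two tests
with deltas 1 and 7 give a median of 1); this is not the script that was
actually used.

```python
from statistics import median_low


def summarise(before, after):
    """Summarise line-count changes for one target.

    `before` and `after` map a test name to the number of lines of
    assembly produced without and with the patches (hypothetical inputs).
    """
    deltas = [after[test] - before[test] for test in before if test in after]
    deltas = [d for d in deltas if d != 0]  # ignore tests that did not change
    if not deltas:
        return None  # such a target would not appear in the table at all
    return {
        "tests": len(deltas),
        "delta": sum(deltas),
        "best": min(deltas),    # largest reduction in lines
        "worst": max(deltas),   # largest increase (or smallest reduction)
        "median": median_low(deltas),  # assumption: lower median
    }


# Made-up numbers, purely for illustration.
before = {"a.c": 10, "b.c": 20, "c.c": 30, "d.c": 5}
after = {"a.c": 8, "b.c": 20, "c.c": 31, "d.c": 4}
print(summarise(before, after))
```

Here b.c is dropped because its line count is unchanged, leaving deltas
of -2, 1 and -1.
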
>
> Of course, the number of lines is only a very rough guide to code size
> and code size is only a very rough guide to performance.  It's just
> a way of getting a feel for how invasive the change is in practice.
>
> As is often the case with this kind of comparison, quite a few changes
> in either direction come from things that the RA doesn't consider,
> such as the ability to merge code after RA.
>
> The msp430-elf results are especially misleading.  The port has patterns
> like:
>
> ;; Alternatives 2 and 3 are to handle cases generated by reload.
> (define_insn "subqi3"
>   [(set (match_operand:QI           0 "nonimmediate_operand" "=rYs,  rm,  &?r, ?&r")
> 	(minus:QI (match_operand:QI 1 "general_operand"       "0,    0,    !r,  !i")
> 		  (match_operand:QI 2 "general_operand"      " riYs, rmi, rmi,   r")))]
>   ""
>   "@
>   SUB.B\t%2, %0
>   SUB%X0.B\t%2, %0
>   MOV%X0.B\t%1, %0 { SUB%X0.B\t%2, %0
>   MOV%X0.B\t%1, %0 { SUB%X0.B\t%2, %0"
> )
>
> The patches make more use of the first two (cheap) alternatives
> in preference to the third, but sometimes at the cost of introducing
> moves elsewhere.  Each alternative counts as one line in this test,
> but the third alternative is really two instructions.
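
To make the distortion concrete, here is a small sketch that counts both
lines and instructions.  It assumes that "{" separates instructions on a
single line, as the output templates above suggest, and it makes no attempt
to skip labels, directives or comments.  The function name and the register
names in the sample strings are made up for illustration.

```python
def count_lines_and_insns(asm_text, separator="{"):
    """Return (lines, instructions) for a fragment of assembly text.

    Assumption: `separator` splits several instructions that share a line.
    """
    lines = 0
    insns = 0
    for raw in asm_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        lines += 1
        insns += sum(1 for part in line.split(separator) if part.strip())
    return lines, insns


# Tied alternative: one line, one instruction.
tied = "SUB.B\tR13, R12\n"
# Untied alternative: one line, but two instructions.
untied = "MOV.B\tR13, R14 { SUB.B\tR12, R14\n"
# Tied alternative plus a separate move: two lines, two instructions.
with_move = "MOV.B\tR13, R12\nSUB.B\tR14, R12\n"

for name, text in [("tied", tied), ("untied", untied), ("with_move", with_move)]:
    print(name, count_lines_and_insns(text))
```

A line-based count treats the last case as worse than the second even
though both contain two instructions.
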
>
> (If the port does actually want us to prefer the third alternative
> over introducing moves, then I think the constraints need to be
> changed.  Using "!" heavily disparages the alternative and so
> it's reasonable for the optimisers to try hard to avoid it.
> If the alternative is actually the preferred way of handling
> untied operands then the "?" on operand 0 should be enough.)
>
> The arm-* improvements come from patterns like:
>
> (define_insn_and_split "*negdi2_insn"
>   [(set (match_operand:DI         0 "s_register_operand" "=r,&r")
> 	(neg:DI (match_operand:DI 1 "s_register_operand"  "0,r")))
>    (clobber (reg:CC CC_REGNUM))]
>   "TARGET_32BIT"
>
> The patches make IRA assign a saving of one full move to ties between
> operands 0 and 1, whereas previously it would only assign a saving
> of an eighth of a move.
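
As a toy illustration of why the size of that saving matters, the sketch
below picks the cheapest register for operand 0 when one candidate would
tie with operand 1.  This is not IRA's code or its cost model; the move
cost, the base costs and all names are hypothetical and chosen only to
show the decision flipping.

```python
from fractions import Fraction

MOVE_COST = 2  # hypothetical cost of one register-to-register move


def pick_register(candidates, tie_weight):
    """Pick the cheapest candidate register.

    `candidates` maps a register name to (base_cost, ties_with_input).
    `tie_weight` is the fraction of a move credited for satisfying a tie.
    """
    def cost(name):
        base, tied = candidates[name]
        saving = tie_weight * MOVE_COST if tied else 0
        return base - saving

    return min(sorted(candidates), key=cost)


# r0 would tie with the input but is slightly more expensive for
# other (made-up) reasons; r1 is free but needs an extra move.
candidates = {"r0": (Fraction(1), True), "r1": (Fraction(0), False)}

print(pick_register(candidates, Fraction(1, 8)))  # old weighting: r1
print(pick_register(candidates, Fraction(1)))     # new weighting: r0
```

With an eighth of a move the tie is too weak to outweigh the other cost,
so the untied register wins; with a full move the tied register wins.
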
>
> The other big winners (e.g. cris-*, sh-* and visium-*) have similar cases.
>
> I'll post the SVE patches that rely on and test for this later.
>
> Thanks,
> Richard


