[Bug target/98981] gcc-10.2 for RISC-V has extraneous register moves

wilson at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Fri Feb 19 00:42:13 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98981

--- Comment #5 from Jim Wilson <wilson at gcc dot gnu.org> ---
Neither of the two patches I mentioned in comment 1 can fix the problem by
itself, as we still have a mix of SImode and DImode operations.

I looked at REE.  It doesn't work because there is more than one reaching def. 
But even if it did work, I don't think it would completely solve the problem
because it runs after register allocation and hence won't be able to remove
move instructions.

To get the best result, we need the register allocator to take two registers
with different modes with overlapping live ranges, and realize that they can be
allocated to the same hard reg because the overlapping uses are
non-conflicting.  I haven't tried looking at the register allocator, but it
doesn't seem like a good way to try to solve the problem.

We have an inconvenient mix of SImode and DImode because we don't have SImode
compare and branch instructions.  That requires sign extending 32-bit values to
64-bit to compare them, which then results in the sign extend and register
allocation optimization issues.  It is unlikely that 32-bit compare and branch
instructions will be added to the ISA though.
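
As a rough illustration of why the sign extension is needed, the sketch below
models a 64-bit register holding a 32-bit value whose upper bits are not a
sign extension.  The helper names sext_w and blt64 are hypothetical, chosen
only to mirror the sext.w pseudo-instruction and a 64-bit blt; this is not
code from GCC.

```c
#include <stdint.h>

/* Hypothetical model of sext.w: sign-extend the low 32 bits of a 64-bit
   register.  The narrowing conversion to int32_t wraps modulo 2^32 on GCC
   (implementation-defined before C23). */
static int64_t sext_w(uint64_t reg)
{
    return (int64_t)(int32_t)(uint32_t)reg;
}

/* Hypothetical model of a 64-bit blt: the compare always looks at all
   64 bits of both registers. */
static int blt64(int64_t a, int64_t b)
{
    return a < b;
}
```

With the low 32 bits holding -1 and the upper 32 bits zero, blt64 against 1
is false on the raw register value but true after sext_w, which is why the
32-bit values have to be sign-extended before the compare and branch.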

One useful thing I noticed is that the program is doing a max operation, and
the B extension adds a max instruction.  Having one instruction instead of a
series of instructions including a branch to compute max makes the optimization
issues easier, and gcc does give the right result in this case.  Using a
compiler with B support I get
        lw      a4,0(a5)
        lw      a2,0(a3)
        addi    a5,a5,4
        addi    a3,a3,4
        addw    a4,a4,a2
        max     a0,a4,a0
        bne     a5,a1,.L2
which is good code with the extra moves and sign-extends removed.  So I have a
workaround of sorts, but only if you have the B extension.
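
For reference, a loop of roughly the following shape would match the assembly
above: two 32-bit loads, a 32-bit add, and a running max.  This is a
reconstruction from the generated code, not the test case attached to the bug;
the function name, parameter names, and the initial value of the accumulator
are assumptions.

```c
/* Hypothetical reconstruction of the loop; not the bug's actual test case. */
int max_sum(const int *a, const int *b, int n)
{
    int best = 0;                 /* assumed initial value */
    for (int i = 0; i < n; i++) {
        int s = a[i] + b[i];      /* corresponds to the addw */
        if (s > best)             /* compare and branch without max */
            best = s;
    }
    return best;
}
```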

The -mtune=sifive-7-series option can support conditional move via macro
fusion.  I was hopeful that this would work as well as max, but unfortunately
the sign-extend that was removed in the max case doesn't get removed in the
conditional move case.  Also, the conditional move is 2-address, and the
register allocator ends up needing a reload, which gives us the unwanted mv
again.  So the code in this case is the same as without the option.  I didn't
check to see if this is fixable.

