[Bug target/69461] [6 Regression] ICE in lra_set_insn_recog_data, at lra.c:964

aoliva at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Thu Jan 28 06:28:00 GMT 2016


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69461

--- Comment #6 from Alexandre Oliva <aoliva at gcc dot gnu.org> ---
Created attachment 37498
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37498&action=edit
Patch I'm testing to fix the bug

LRA tries harder than reload to avoid creating a stack slot to satisfy insn
constraints.  As a result, it creates an additional REG:TI pseudo to reload a
SUBREG:V2DF of a REG:TI, and then tries to assign that pseudo to VSX_REGS.
That in turn causes another reload, because there's no way to load a TImode
value into a VSX_REG in *mov<mode>_ppc64, and that requires yet another, and
so on, until the limit on reload insns is exceeded.

The first problem is that we shouldn't be creating a TImode reload for
VSX_REGS, since we can't possibly satisfy that: TImode values are not ok for
VSX_REGS.  I've adjusted in_class_p to check HARD_REGNO_MODE_OK, and that put
an end to the infinite stream of reloads.
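
As a rough illustration of the kind of check described above (this is not the
attached patch, and it does not use GCC's real data structures), the sketch
below models a class test that also asks whether any hard register in the
class can hold the mode.  Every name prefixed with toy_ or TOY_ is
hypothetical, and the register layout is invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for machine modes and register classes.  */
enum toy_mode { TOY_TImode, TOY_V2DFmode };
enum toy_class { TOY_GENERAL_REGS, TOY_VSX_REGS };

#define TOY_NUM_HARD_REGS 8

/* Invented register file: regs 0-3 are general, regs 4-7 are VSX.  */
enum toy_class
toy_regno_class (int regno)
{
  assert (regno >= 0 && regno < TOY_NUM_HARD_REGS);
  return regno < 4 ? TOY_GENERAL_REGS : TOY_VSX_REGS;
}

/* Toy analogue of a HARD_REGNO_MODE_OK query.  Assumption for this
   sketch: VSX regs hold only V2DFmode, general regs hold only TImode.  */
bool
toy_hard_regno_mode_ok (int regno, enum toy_mode mode)
{
  if (toy_regno_class (regno) == TOY_VSX_REGS)
    return mode == TOY_V2DFmode;
  return mode == TOY_TImode;
}

/* A class is acceptable for MODE only if at least one of its hard
   registers can actually hold a value of MODE.  */
bool
toy_class_ok_for_mode (enum toy_class cl, enum toy_mode mode)
{
  for (int regno = 0; regno < TOY_NUM_HARD_REGS; regno++)
    if (toy_regno_class (regno) == cl
        && toy_hard_regno_mode_ok (regno, mode))
      return true;
  return false;
}
```

In this toy model, toy_class_ok_for_mode (TOY_VSX_REGS, TOY_TImode) is false,
so a TImode reload pseudo would never be steered into the VSX class in the
first place.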

It was still a very long stream, though.  simplify_operand_subreg attempts to
turn SUBREGs of MEMs into MEMs, but it will only proceed with the
simplification if the resulting address is at least as valid as the original.  

Alas, instead of the simplification, we end up repeatedly generating reloads
copying the initial value to stack slots with growing offsets, until the offset
grows enough that the address becomes invalid, at which point the subreg
simplification is performed.  That's 2047 excess stores and loads, plus insns
that compute the stack address for each of them.
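
The runaway behaviour can be mimicked with a small stand-alone model.  It is
only a sketch under assumptions that the comment does not state: 16-byte
stack slots, a signed 16-bit displacement limit for the inner-mode address,
a vector-mode access that never accepts base+offset, and a first slot at
offset 16.  These parameters were picked so that the toy count lines up with
the figure quoted above; the real frame layout may differ.  All toy_ names
are hypothetical.

```c
#include <stdbool.h>

/* Assumed parameters, not taken from the bug report.  */
enum { TOY_SLOT_SIZE = 16, TOY_MIN_DISP = -32768, TOY_MAX_DISP = 32767 };

/* Assumption: the inner-mode MEM accepts base+offset as long as the
   offset fits in a signed 16-bit displacement.  */
bool
toy_inner_addr_valid (long offset)
{
  return offset >= TOY_MIN_DISP && offset <= TOY_MAX_DISP;
}

/* Assumption: the vector-mode access never accepts base+offset.  */
bool
toy_subreg_addr_valid (long offset)
{
  (void) offset;
  return false;
}

/* The test as described in the text: simplify only if the resulting
   address is at least as valid as the original one.  */
bool
toy_simplify_ok (long offset)
{
  return !toy_inner_addr_valid (offset) || toy_subreg_addr_valid (offset);
}

/* Count how many times the value is copied to a fresh slot before the
   simplification is finally accepted.  */
int
toy_count_excess_copies (long first_offset)
{
  int copies = 0;
  long offset = first_offset;

  while (!toy_simplify_ok (offset))
    {
      offset += TOY_SLOT_SIZE;  /* Next slot sits one slot further out.  */
      copies++;
    }
  return copies;
}
```

With these assumed parameters, toy_count_excess_copies (16) returns 2047:
the loop only stops once the offset reaches 32768 and the inner-mode address
itself stops being valid.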

In order to fix that, I amended the test on whether to proceed with the subreg
simplification to take into account the availability of regs that can hold a
value of the intended mode in the goal class for that operand.
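
One plausible reading of the amended test, written as a stand-alone predicate
rather than as the actual change to simplify_operand_subreg: the function
name and its three boolean parameters are hypothetical, and the attached
patch may structure the condition differently.

```c
#include <stdbool.h>

/* Decide whether to replace a SUBREG of a MEM by a plain MEM.

   old_addr_valid: the address was valid before the simplification.
   new_addr_valid: the address is valid after the simplification.
   goal_class_has_reg_for_mode: some register in the operand's goal
     class can hold a value of the intended mode, so reloading the
     inner MEM into a register is a workable alternative.  */
bool
toy_should_simplify_subreg (bool old_addr_valid, bool new_addr_valid,
                            bool goal_class_has_reg_for_mode)
{
  /* Original rule: the new address is at least as valid as the old.  */
  if (!old_addr_valid || new_addr_valid)
    return true;

  /* Assumed amendment: if no register in the goal class can hold the
     value, reloading the MEM cannot help, so simplify anyway.  */
  return !goal_class_has_reg_for_mode;
}
```

Under this reading, the case from this bug (old address valid, new address
invalid, no suitable register in the goal class) is simplified right away
instead of triggering another copy to a new stack slot.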

With that, we go from 2047 excess stores and loads to only 1.  I haven't yet
figured out how to get rid of this one extra store and load, or the excess
stack slot, but I figured I'd share what I have, which I believe to be a solid
fix, and leave the investigation of a further LRA improvement for later.

