The problem is that not all register moves are always going to be
eliminated, even when no mode changes are involved. It might make
sense to restrict that code you quoted:
    case SIMPLE_PSEUDO_REG_MOVE:
      if (MODES_TIEABLE_P (GET_MODE (x), word_mode))
        bitmap_set_bit (decomposable_context, regno);
      break;
to the second pass though.
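For illustration only, a minimal standalone sketch of what such a
restriction might look like. This is a model, not lower-subreg.c code:
the second_pass flag, the modes_tieable argument, and the bool array
standing in for the decomposable_context bitmap are all hypothetical.

```c
#include <stdbool.h>

#define MAX_REGNO 64

/* Hypothetical stand-in for the decomposable_context bitmap.  */
static bool decomposable_context[MAX_REGNO];

enum move_class { NOT_SIMPLE_MOVE, SIMPLE_PSEUDO_REG_MOVE };

/* Mark REGNO as decomposable for a simple pseudo-to-pseudo move, but
   only when SECOND_PASS is set.  MODES_TIEABLE models the result of
   MODES_TIEABLE_P (GET_MODE (x), word_mode).  */
static void
note_move (enum move_class cls, unsigned regno, bool modes_tieable,
           bool second_pass)
{
  switch (cls)
    {
    case SIMPLE_PSEUDO_REG_MOVE:
      if (second_pass && modes_tieable && regno < MAX_REGNO)
        decomposable_context[regno] = true;
      break;
    default:
      break;
    }
}
```

With this guard, a copy seen in the first pass leaves the pseudo
alone, and only the second pass treats the copy as a reason to
decompose it.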
I've not studied that patch in detail, but I'm not sure it'll help. In
most cases, including my testcase, lowering is the correct thing to do
if NEON (or IWMMXT, perhaps) is not enabled.
Right. I think I misunderstood, sorry. I thought this regression was
for NEON only, but do you mean that adding these NEON patterns introduces
the regression for non-NEON targets as well?
When NEON is enabled, however, it may still be the right thing to do:
NEON does not provide a full set of DImode operations. The test for
subreg-only uses ought to be enough to differentiate, once the
extraneous pseudos such as the one in my testcase have been dealt
with.
OK. If/when that patch goes in, the ARM backend is going to have
to pick an rtx cost for DImode SETs. It sounds like the cost will need
to be twice an SImode move regardless of whether or not NEON is enabled.
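As a rough sketch of the cost relationship described above, assuming
a 32-bit word and a move cost that scales with the number of words
copied. The move_cost helper and WORD_BYTES are hypothetical, and
COSTS_N_INSNS is defined locally here, modelled on the GCC macro of
the same name; none of this is the actual ARM backend code.

```c
/* Local definition modelled on GCC's COSTS_N_INSNS.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Assumption: 32-bit words, so SImode is 4 bytes and DImode is 8.  */
#define WORD_BYTES 4

/* Hypothetical helper: cost of a register-to-register SET of a value
   MODE_BYTES wide, charged as one instruction per word.  */
static int
move_cost (int mode_bytes)
{
  int words = (mode_bytes + WORD_BYTES - 1) / WORD_BYTES;
  return COSTS_N_INSNS (words);
}
```

Under these assumptions move_cost (8) is exactly twice move_cost (4),
independent of any NEON setting.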