Patch: Revert find_reloads part of the IRA merge
Richard Earnshaw
rearnsha@arm.com
Fri Jul 9 13:28:00 GMT 2010
On Fri, 2010-07-09 at 13:09 +0200, Bernd Schmidt wrote:
> For a while now I've been struggling with a few Thumb-1 patches, because
> trying to tune the backend's code generation always seemed to produce
> some inexplicable results. I've finally tracked the problem down to a
> change in reload.
>
> As part of the big IRA merge, find_reloads was changed to prefer
> alternatives for which fewer reloads require a
> SMALL_REGISTER_CLASS_P class. (I'll repeat what I said at the time,
> changes such as this should have been posted and reviewed separately
> from IRA.)
>
> This change causes problems for Thumb-1, because its general register
> class, LO_REGS, is CLASS_LIKELY_SPILLED_P, while the high registers
> (which are fewer in number and harder to access) aren't. As a result,
> we get bizarre results where reload prefers alternatives using high
> registers over alternatives using low registers, or even prefers
> alternatives with fewer operands allowing registers. Another defect is
> that the code doesn't take goal_alternative_win into account.
>
> There are other conceivable ways of tuning this heuristic, e.g.
> computing the smallest register class size over all operands for an
> alternative, and then picking the alternative that has the highest such
> minimum. I tested a patch for this, but thinking about it some more,
> this also has the potential for unpleasant surprises: imagine a Thumb-1
> insn with one alternative allowing "l" (LO_REGS) and another allowing
> "lk" (LO_REGS + the stack pointer); we'd always prefer the second one
> for no reason. At this point I decided that the best thing to do for
> now would be to revert this change.
>
> It should be possible to do better here. The thorough, but likely
> expensive, way would be to record reloads for all alternatives with the
> same cost, then choose the best one (the one requiring the fewest
> spills) in the caller using find_reload_regs. A simpler way, closer to
> Vlad's change here, would be to determine whether an operand is input,
> output, or both, then pick the right regset(s) from the insn_chain,
> check whether any registers are available, and count the number of
> known forced spills for each alternative.
>
> It might also be possible to achieve a similar effect by just reordering
> alternatives in a port that wishes to prevent reload from using
> likely-spilled classes if possible.
>
> The patch below has bootstrapped on i686-linux; regression tests in
> progress. I've also tested completely different patches for the
> problem, but I guess no one wants to know that. :-)
>
> Typical code generation differences on Thumb-1 look like this:
>
> - mov r2, #192
> - mov r3, r2
> - add r3, r3, r9
> + mov r3, r9
> + add r3, r3, #192
>
> There's a small arm-specific bit in the patch, which switches two
> alternatives on the grounds that synthesizing a negative constant in a
> register is more expensive than reloading one register into another.
> Hence, the alternative using constants should be preferred if it can
> match, and thus it should go first:
>
> - mov r5, #32
> - neg r5, r5
> - add r4, r2, r5
> + mov r4, r2
> + sub r4, r4, #32
> bmi .L2
>
> This was a case where the find_reloads code produced better results for
> entirely the wrong reasons.
>
> I'll commit the reload patch if no one objects soonish and if the ARM
> part is approved.
>
>
> Bernd
The ARM bit is fine.
R.