This is the mail archive of the mailing list for the GCC project.


Patch: Revert find_reloads part of the IRA merge

For a while now I've been struggling with a few Thumb-1 patches, because
trying to tune the backend's code generation always seemed to produce
some inexplicable results.  I've finally tracked the problem down to a
change in reload.

As part of the big IRA merge, find_reloads was changed to prefer
alternatives in which fewer operands require a
SMALL_REGISTER_CLASS_P class.  (I'll repeat what I said at the time:
changes such as this should have been posted and reviewed separately
from IRA.)

This change causes problems for Thumb-1, because its general register
class, LO_REGS, is CLASS_LIKELY_SPILLED_P, while the high registers
(which are fewer in number and harder to access) aren't.  As a result,
we get bizarre results where reload prefers alternatives using high
registers over alternatives using low registers, or even prefers
alternatives with fewer operands allowing registers.  Another defect is
that the code doesn't take goal_alternative_win into account.

There are other conceivable ways of tuning this heuristic, e.g.
computing the smallest register class size over all operands for an
alternative, and then picking the alternative that has the highest such
minimum.  I tested a patch for this, but thinking about it some more,
this also has the potential for unpleasant surprises: imagine a Thumb-1
insn with one alternative allowing "l" (LO_REGS) and another allowing
"lk" (LO_REGS + the stack pointer); we'd always prefer the second one
for no reason.  At this point I decided that the best thing to do for
now would be to revert this change.

It should be possible to do better here.  The thorough, but likely
expensive, way would be to record reloads for all alternatives with the
same cost, then choose the best one (the one requiring the fewest
spills) in the caller, using find_reload_regs.  A simpler way, closer
to Vlad's change here, would be to determine whether an operand is an
input, an output, or both, pick the right regset(s) from the
insn_chain, look whether any registers are available, and count the
number of known forced spills for each alternative.

It might also be possible to achieve a similar effect by just reordering
alternatives in a port that wishes to prevent reload from using
likely-spilled classes if possible.
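For illustration, an entirely made-up machine-description fragment along
these lines (the pattern name and constraints are hypothetical, not from
any real port) would simply list the alternative with the
non-likely-spilled class first, so that first-match ordering steers
reload away from the likely-spilled one when both can match:

```lisp
;; Hypothetical fragment: the "h" (not likely spilled) alternative is
;; listed before the "l" (likely-spilled) one on purpose.
(define_insn "*movsi_example"
  [(set (match_operand:SI 0 "register_operand" "=h,l")
        (match_operand:SI 1 "register_operand"  "r,r"))]
  "" "...")
```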

The patch below has bootstrapped on i686-linux; regression tests in
progress.  I've also tested completely different patches for the
problem, but I guess no one wants to know that. :-)

Typical code generation differences on Thumb-1 look like this:

-       mov     r2, #192
-       mov     r3, r2
-       add     r3, r3, r9
+       mov     r3, r9
+       add     r3, r3, #192

There's a small arm-specific bit in the patch, which switches two
alternatives on the grounds that synthesizing a negative constant in a
register is more expensive than reloading one register into another.
Hence, the alternative using constants should be preferred if it can
match, and thus it should go first:

-       mov     r5, #32
-       neg     r5, r5
-       add     r4, r2, r5
+       mov     r4, r2
+       sub     r4, r4, #32
        bmi     .L2

This was a case where the find_reloads code produced better results for
entirely the wrong reasons.

I'll commit the reload patch soonish if no one objects and if the ARM
part is approved.


Attachment: smallclass.diff
