hard register reload patch

Tue Nov 30 23:59:00 GMT 1999

On Tue, 30 Nov 1999, Jeffrey A Law wrote:

> Second note, I'd like Bernd to comment on this code

Some comments at the top which aren't directly related to this patch.
I believe the central problem we need to solve before considering the parts
of this patch related to hard register reloading is how to treat hard
register references on ports that define SMALL_REGISTER_CLASSES.  Currently,
this issue is very muddy.

I suggest we keep the rules simple and easy to understand: the only insns
which reference hard registers may be moves which copy them into/out of
pseudos.  Hard registers may not live across any other insns.
To me, this appears to be what was originally intended, but it's not how all
parts of the compiler behave these days.  [I have an unreviewed combiner
patch (~ 10 months old) which tries to replace the (convoluted) logic dealing
with hard regs and SMALL_REGISTER_CLASSES with code that just enforces these
rules.  This patch fixes two compiler aborts when compiling the Linux kernel
with -mregparm on the i386.]

>   In message < 199911300108.BAA20462@phal.cygnus.co.uk >you write:
>   > As discussed before, there are sometimes cases where a hard register is
>   > mentioned in an insn, yet it needs to be available for reloads, e.g.
>   > the explicit use is an input and we need the same register for an output
>   > reload, or vice versa.
> Under precisely what circumstances does this happen?  The only case I'm aware
> of right now could be considered a bug in combine and or the dependency
> analysis to set up LOG_LINKs.

The SH port explicitly sets this up in some patterns.  (Some of the following
is partly guesswork, since I haven't yet seen the testcase that led to this
patch).  Take a pattern like this:

(define_insn "fix_truncdfsi2_i"
  [(set (reg:SI 22)
        (fix:SI (match_operand:DF 0 "arith_reg_operand" "f")))
   (use (match_operand:PSI 1 "fpscr_operand" "c"))]
   "TARGET_SH4"
   "ftrc %0,fpul"
  [(set_attr "type" "dfp_conv")])

If I recall Joerns explanation correctly, the problem is that the input
operand may need reg 22 (fpul) for a secondary reload.  IMO, the pattern
should be changed to use a match_operand with a single register class for
the output operand.  That should enable reload to work correctly; we'd get
two reloads requiring reg 22 which don't conflict.
Joern believes that this approach would cause unacceptable loss of code
quality, which is possible, but I haven't seen any evidence yet to support
this.

There was a discussion about this a month or two ago (before someone tries to
find it in the mail archives: that was on the cygnus local gcc list).  I sent
a patch against sh.md plus some analysis in which I tried to show that code
quality is the same on average.  (Note that my patch wasn't tested for
correctness; as I said I'm lacking a testcase.  It may have been completely
bogus).

> I'm not going to accept or reject the patch right now -- I want to get a
> better understanding of why we need to solve this problem in reload.
> 
> Thus, my comments below apply if and only if we decide that solving this
> problem in reload is the right thing to do.

I believe that, regardless of what we decide to do about the problems
described above, there are at least some parts of the patch which are going
in the right direction and make sense.
It's a start to fix one of the remaining problems that make reload the mess
that it is today: reload registers are allocated both in reload1.c
(choose_reload_regs, find_reload_regs, and all the rest) and in reload.c
(all the places that set reload_reg_rtx).  The latter know about the fact
that certain hard regs live only through parts of the insn and make decisions
based on that knowledge, while the former ignore this issue.  It would be
nice to have only one place in reload that allocates reload registers.
Joerns patch is a step in that direction (combine_reloads goes away).

>   > You can tell a hard reload from an ordinary one because it always has the
>   > reg_rtx field set, and it is equal to the in or out field, or to both.
> ?!? Can't we tell by the reload type?  Or maybe it would be better to have a
> flag which indicates that this is a hard reload.

Absolutely.  IMO, relying on special combinations of values is brittle and
non-obvious.  I think it would be much cleaner to have a flag.  If we know
that at the moment this flag is equivalent to some combination of the other
values, then we can test this with assertions.
That reminds me, we have some code in reload at the moment which does
something very similar:

   /* [ ... ]  We got an
      AUTO_INC reload if reload_out is set but reload_out_reg isn't.  */

This kind of thing is just horrible, IMO.  It's made worse by the fact that
reload_out_reg isn't documented and I have a feeling that the documentation
for reload_in_reg isn't totally accurate either.  Joern, if you fully
understand the way these two fields are used, it would be great if you could
add some documentation.

> I sincerely hope you documented this in the code.  One of my biggest issues
> with the reload pass is the precise lifetimes of reloads are not documented.
> I'd hate to see us continue that terrible habit.

I'd like to see all the RELOAD_FOR_xxx codes go away.  I've got half-done
patches to scan the rtl for replacements and thus build a definition/use
chain for reloads which should enable us to determine conflicts much more
precisely and hopefully with simpler code as well.  But I'll leave that
discussion for another occasion 8)

>   > This patch is also expected to make reload a bit faster for large programs;
>   > instead of using regsets live_before and live_after, which were used
>   > in tandem all the time, I use live_throughout and dead_or_set.
>   > In a typical instruction, only few registers die or are set, so the latter
>   > regset should be pretty quick to process - when it has to be processed;
>   > often just processing live_throughout does just fine.
> This seems pretty reasonable.  It is probably worth breaking this out into a
> separate patch since it appears to be an independent improvement.

Right.  I haven't really tried understanding the patch yet because too many
parts of it seem to do something unrelated - e.g. I don't see how hard
register reloading could influence caller-saves, so I'm not sure what those
parts of the patch are doing.

Bernd