Reload patch version 2

Fri Sep 4 04:11:00 GMT 1998

  In message <Pine.SOL.3.90.970821174953.733E-100000@maigret.informatik.rwth-aachen.de>you write:
  > Here's an updated version of my reload patch.
Just so I don't catch anyone completely off guard: I'm actually
responding to Bernd's old message from last year.  I think there are
a few points that need to be clarified.  I'm still going through all
the old mail on this topic.

  > Problems:
  >   - the two potential code quality problems mentioned in my first mail
  >     concerning caller-saves which need a spill reg and the code in reorg.c
  >     which assumes reload regs are dead at a CODE_LABEL still exist.
The caller-saves issues are important to deal with, particularly for
floating point performance on sparcs.  If you want a way to quantify
the lossage for poor caller-saves code, that's the target to look at.

The code which assumes that reload regs are dead at CODE_LABELs can
probably just go away.  It was done primarily to help the SH.  I suspect
the improved reload code alone would totally mask any lossage in this
area.  And we can get the same from the old code by providing reorg
with real register live/dead information if it turns out to be important.

  >   - reload isn't exactly fast, but then I didn't spend any time optimizing
  >     things yet.
That's more than OK at this point.  First order of business is to make
sure we're going in the right direction from a design standpoint, then
work on the correctness of implementation.  We can address performance
issues as needed once we're on the right path to a correct implementation.

  >   - there are still some cases where the code might spill too many registers.
We can work on the micro issues once we've got the macro issues dealt
with.

  >     Also, the cost calculations for the damage of spilling pseudo regs isn't
  >     very accurate.
Yes.  The spill cost analysis is going to have to be totally reworked.
I haven't actually looked at the newer patches yet, so I don't know if
you've already made efforts to solve this problem.

  >   - Some software, notably the Linux kernel on the i386, contains bogus asm
  >     statements that need a register from a single class for an input reload,
  >     but also contain a CLOBBER of that register.  According to a comment in
  >     reload.c, a CLOBBER makes a register unavailable from before the insn
  >     until after it, so the asm statements are wrong, and unfortunately the new
  >     code detects that.
That's fine :-)  Detecting this situation was one of the things the
Linux kernel folks actually want, since it'll help them identify the
asms which have problems.

  >     The correct way to write them would be to add an
  >     output reload that matches the input register and reload it into a dummy
  >     register.  But sometimes this is hard to do, because the number of
  >     constraints for an asm statement is restricted by MAX_RECOG_OPERANDS.
There were some patches to crank up MAX_RECOG_OPERANDS.  I think they
got tabled at some point, but I've got no particular objection to 
raising it.  It'll require some examination of the md files to catch
things like %[letter]10 which would have a different meaning if
MAX_RECOG_OPERANDS was increased.
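
To make the asm problem above concrete, here is a minimal sketch of the
bogus form and of the rewrite Bernd describes (an output that matches the
input register and lands in a dummy variable).  The "rep; movsb" copy and
the names copy_bytes, d0, d1 and d2 are only illustrative assumptions, not
taken from any particular kernel source, and the portable fallback is there
only so the sketch builds on non-x86 targets.

```c
#include <assert.h>
#include <stddef.h>

/* Copy n bytes.  On x86, "rep movsb" modifies the count, source and
   destination registers, so the compiler must be told about that. */
static void copy_bytes(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* Bogus form (shown only as a comment): each input is pinned to a
       single-register class, and the same registers are also clobbered,
       which leaves no register available for the input reloads:

         asm volatile ("rep; movsb"
                       :
                       : "c" (n), "S" (src), "D" (dst)
                       : "cx", "si", "di", "memory");
    */

    /* Rewrite: dummy outputs in the same registers, tied to the inputs
       with matching constraints, tell the compiler the registers change. */
    size_t d0;
    const void *d1;
    void *d2;
    __asm__ __volatile__ ("rep; movsb"
                          : "=c" (d0), "=S" (d1), "=D" (d2)
                          : "0" (n), "1" (src), "2" (dst)
                          : "memory");
    (void) d0;
    (void) d1;
    (void) d2;
#else
    /* Portable fallback so the sketch still compiles elsewhere. */
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
#endif
}
```

Note that the rewrite needs six operands where the bogus form needed three,
which is why the operand limit becomes a concern for larger asms.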

  >   - maybe the register life information could be used to generate correct
  >     REG_DEAD notes.
Or just run a hard reg life analysis pass after reload.   Having this
info available would deal with the recog problem mentioned above and
simplify its code.  It would also be useful for the sched2 instruction
splitter.
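
As a rough illustration of what such a pass computes, here is a minimal
sketch of a backward scan over one straight-line block of insns.  It is
not GCC code: the regset type, struct insn and compute_reg_dead are
hypothetical names, hard regs are modelled as bits in a mask, and
subtleties such as a register that is both used and set in the same insn
are ignored.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int regset;    /* bit i set = hard reg i (hypothetical) */

struct insn {
    regset uses;                /* hard regs read by the insn */
    regset sets;                /* hard regs written by the insn */
    regset dead;                /* result: regs whose value dies here */
};

/* Walk the block backwards.  live_out is the set of hard regs live
   after the last insn; the return value is the set live on entry. */
static regset compute_reg_dead(struct insn *insns, size_t n, regset live_out)
{
    regset live = live_out;

    assert(insns != NULL || n == 0);
    for (size_t i = n; i-- > 0; ) {
        struct insn *in = &insns[i];

        /* A reg read here but not live afterwards has its last use here. */
        in->dead = in->uses & ~live;

        /* Standard backward transfer: kill the sets, then add the uses. */
        live = (live & ~in->sets) | in->uses;
    }
    return live;
}
```

A real pass would also have to iterate across basic block boundaries
until the live sets stop changing; the sketch only covers the per-block
step.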

  >   - For every spilled pseudo, check the reloaded insns across which it is
  >     live and if
  >     * the register is not actually used in any of them
  >     * there's always a free unused hard reg which the pseudo could cheaply be
  >       moved into and out of.
  >     * the total cost of generating these moves is lower than the cost of
  >       spilling the pseudo
  >     then generate the appropriate move instructions instead of spilling the
  >     pseudo.
Yea.  
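
The three conditions in the quoted proposal can be sketched as a simple
predicate.  This is only an illustration under stated assumptions: struct
live_insn, its fields and prefer_moves_over_spill are hypothetical names,
and the costs are abstract integers rather than anything reload actually
computes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One reloaded insn across which the spilled pseudo is live. */
struct live_insn {
    bool uses_pseudo;        /* the insn actually references the pseudo */
    bool has_free_hard_reg;  /* a free, unused hard reg is available */
    int  move_cost;          /* cost of moving into and out of that reg */
};

/* Return true if moving the pseudo around the reloaded insns is
   cheaper than spilling it, under the three quoted conditions. */
static bool prefer_moves_over_spill(const struct live_insn *insns,
                                    size_t n, int spill_cost)
{
    int total_move_cost = 0;

    assert(insns != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (insns[i].uses_pseudo)
            return false;            /* condition 1 fails */
        if (!insns[i].has_free_hard_reg)
            return false;            /* condition 2 fails */
        total_move_cost += insns[i].move_cost;
    }
    return total_move_cost < spill_cost;   /* condition 3 */
}
```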

  >   - Could extend the above to allocate a stack slot for the pseudo, but keep
  >     it in a register wherever possible.  This would get kind of hairy, though.
This can be handled with traditional load/store motion optimizations.  It's
basically just a redundancy elimination problem.

  >   - Also rather hairy: if a pseudo dies in an instruction, its hard reg could
  >     still be used for an output reload. Likewise, if it is set in the insn
  >     and dead before, its hard reg may be used for an input reload.
Joern may have done some work on this already.  It sounds similar
to stuff he's been working on.  Regardless, I don't think it's something
we need to address in your patch right now.

jeff