This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH/RFC] PR target/15130 SH: A tail call optimization


> Excellent!  I thought that it would be not easy to get this "sibcall_usage"
> set at sibcall_epilogue.  How can it be done?

Oops.  I somehow thought the register was selected there - it is actually
selected by reload.  Which makes the current mechanism even more dangerous.
Every register is approximately equally likely to be chosen.  You can't
change scanange_reg to significantly reduce the likelihood of a collision.

And since we are called inside a start_sequence / end_sequence nesting, and
there might be more than one sibcall site in the function, there is no good
way to find out what the actual sibcall looks like.

So, as it stands, we have to assume that all registers in
reg_class_contents[SIBCALL_REGS] are used, and also all the argument passing
registers - leaving no call-clobbered register to allocate our temporary in
for SH1..SH4.
OTOH, prior to the final adjustment, if any general purpose register was saved,
it is available.  And the final adjustment shouldn't actually need a
temporary, because the adjustment is of limited size.
So it is rather unlikely that we'll be short of a temporary.
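
To make the register accounting above concrete, here is a minimal sketch that models the general registers as a bit mask.  The register layout (r0..r7 call-clobbered, r4..r7 for arguments, SIBCALL_REGS as r0..r3) is my assumption for SH1..SH4, and the names regset, free_temps and pick_temp are hypothetical; none of this is code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit n set means general register rN is in the set. */
typedef uint16_t regset;
#define REG(n) ((regset)(1u << (n)))

/* Assumed layout for SH1..SH4, for illustration only. */
#define SH_CALL_CLOBBERED ((regset)0x00ff)  /* r0..r7 */
#define SH_ARG_REGS       ((regset)0x00f0)  /* r4..r7 */
#define SH_SIBCALL_REGS   ((regset)0x000f)  /* r0..r3, an assumption */

/* Registers usable as a temporary in the sibcall epilogue: call-clobbered
   registers that can be neither the sibcall address nor an argument, plus
   any call-saved registers the prologue saved and the epilogue has not
   restored yet. */
regset free_temps(regset call_clobbered, regset sibcall_regs,
                  regset arg_regs, regset saved)
{
    /* Saved registers are call-saved, so the two sets must not overlap. */
    assert((saved & call_clobbered) == 0);
    regset busy = (regset)(sibcall_regs | arg_regs);
    return (regset)((call_clobbered & (regset)~busy) | saved);
}

/* Lowest-numbered register in AVAIL, or -1 if the set is empty. */
int pick_temp(regset avail)
{
    for (int n = 0; n < 16; n++)
        if (avail & REG(n))
            return n;
    return -1;
}
```

With no saved register, free_temps returns the empty set and pick_temp gives -1, which is the problem case.  Passing REG(8) as the saved set makes r8 available.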

So the possibilities are:

1) Do the save into macl - but we should only do this if we find no other
   register, which should only happen for sibcall epilogues, and even
   there only in rare circumstances.
   Note, this doesn't work with the TARGET_HITACHI ABI.

2) When initializing SIBCALL_REGS, reserve one non-parameter passing
   register, which is not STATIC_CHAIN_REGNUM, i.e. one of r0, r1 or r2.
   The downside is that this limits reload's freedom to choose a register,
   and creates more potential problems when the user gets creative with
   -ffixed-reg.

3) Use tentative instruction patterns for the adjustment load & add, which
   conflict with all return registers for scheduling.  In
   machine_dependent_reorg, find out which register is actually free.
   The scheduling will suck, but it doesn't affect code quality unless it
   is actually used, which is pretty much never...

4) Use a push/pop sequence; except that, when not using a frame pointer,
   you'll need to push one register, use that to calculate an address to
   save a second one at the bottom of the to-be-discarded frame, pop
   the saved value of the first reg into the second, store it at the bottom
   of the frame too, do the adjustment, and then pop both registers.
   Again, this is pretty grotty code - even worse than 3) - but it doesn't
   affect any other parts of the compiler, and it should be simpler to
   implement.  I.e.:
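   mov r4,@r15      ! park r4 in the frame that is about to be discarded
   mov adjust,r4    ! r4 = size of the adjustment
   add r15,r4       ! r4 = final stack pointer
   mov r5,@-r4      ! save r5 just below the final stack pointer
   mov @r15,r5      ! fetch the parked r4 value into r5
   mov r5,@-r4      ! save it below r5's slot
   mov r4,r15       ! do the adjustment, less the two saved words
   mov @r15+,r4     ! restore r4
   mov @r15+,r5     ! restore r5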
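
As a sanity check on the sequence in 4), here is a small C model that executes it one instruction at a time on a toy stack.  It is only a sketch: struct sh_state, word_at and adjust_stack_without_temp are hypothetical names, r15 is treated as a byte offset into a small array, and the adjustment is assumed to be a multiple of 4 and at least 8 bytes so the two save slots do not overlap the word parked at @r15.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_WORDS = 64 };

struct sh_state {
    uint32_t r4, r5, r15;        /* r15 is a byte offset into mem */
    uint32_t mem[STACK_WORDS];
};

/* Word at byte address ADDR; rejects misaligned or out-of-range accesses. */
static uint32_t *word_at(struct sh_state *s, uint32_t addr)
{
    assert(addr % 4 == 0 && addr / 4 < STACK_WORDS);
    return &s->mem[addr / 4];
}

/* Add ADJUST to r15 using only r4 and r5, preserving both of them. */
void adjust_stack_without_temp(struct sh_state *s, uint32_t adjust)
{
    *word_at(s, s->r15) = s->r4;                  /* mov r4,@r15   */
    s->r4 = adjust;                               /* mov adjust,r4 */
    s->r4 += s->r15;                              /* add r15,r4    */
    s->r4 -= 4; *word_at(s, s->r4) = s->r5;       /* mov r5,@-r4   */
    s->r5 = *word_at(s, s->r15);                  /* mov @r15,r5   */
    s->r4 -= 4; *word_at(s, s->r4) = s->r5;       /* mov r5,@-r4   */
    s->r15 = s->r4;                               /* mov r4,r15    */
    s->r4 = *word_at(s, s->r15); s->r15 += 4;     /* mov @r15+,r4  */
    s->r5 = *word_at(s, s->r15); s->r15 += 4;     /* mov @r15+,r5  */
}
```

In this model r4 and r5 come out unchanged and r15 ends up exactly adjust bytes higher, since the two pre-decrement stores are cancelled by the two post-increment loads.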

Considering that it is very rare that we have a large frame, but save no
general purpose registers, I think we should go with 3) or 4).

