This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Work in progress: "Super Sib Calls"; opinions sought


> Rather than wade through your patch, perhaps you can explain
> _how_ you intend to attack the current problems with regular
> sibcalls?

First off, super sib calls will not be a replacement for the current
sib call stuff, but rather an addition.  I figure that's necessary in
case I run into portability issues along the way.

So, what I'm trying to do is offer more general stack re-usage for calls
in tail position, even if the sib call optimisation fails.  That is, I'm
generating RTL code similar to an ordinary call for each tail call, but
I'm moving the args back into the incoming arg space _after_ they have all
been evaluated and mangled via the outgoing arg space.  That way there is
no need to worry about overlapping arguments (also see below).

And how have I proceeded so far?  I have added a new sequence to the
CALL_PLACEHOLDER object that is filled in by calls.c.  Hence, a new
independent call chain is passed on and later evaluated during
rest_of_compilation in sibcall.c.  If, for example, the tail call
contains var args, then super sib calls (as well as other tail call
optimisations) fail.  However, in cases where the callee requires no more
arg space than the caller, super sib calls actually optimise the call,
_even_ if the current function has allocated huge amounts of stack
space (for locals).  Note that sibcalls currently fail in such cases!

Other restrictions, like "no indirect calls", "-fpic", etc., currently
apply to super sib calls as well, but I'm working on removing some of
them (indirect calls are especially important, even if specific platforms
like ARM may not benefit from these changes).
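
As a rough illustration of the conditions listed so far, the check could
be modelled as the predicate below.  This is only a sketch: the struct,
its fields and the function name are hypothetical and do not come from the
patch or from GCC, and it encodes just the restrictions named in this mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one call in tail position (not a GCC type). */
struct tail_call_site {
    size_t caller_arg_bytes;   /* incoming arg space of the current function */
    size_t callee_arg_bytes;   /* arg space the callee requires */
    size_t caller_frame_bytes; /* stack space allocated for locals */
    bool   callee_is_varargs;  /* the tail call involves var args */
    bool   is_indirect;        /* call through a function pointer */
    bool   pic;                /* compiling with -fpic */
};

/* Returns true if the call is a super sib call candidate under the
   restrictions described in the mail.  Assumption: these are the only
   checks; a real implementation may well need more. */
static bool super_sibcall_ok(const struct tail_call_site *s)
{
    assert(s != NULL);
    if (s->callee_is_varargs || s->is_indirect || s->pic)
        return false;
    /* The callee must fit into the space allocated by the caller's caller. */
    return s->callee_arg_bytes <= s->caller_arg_bytes;
}
```

Note that caller_frame_bytes is deliberately not consulted: in this model a
large frame of locals does not disqualify the call.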

To sum this up a bit: my patch mainly touches calls.c and sibcall.c.  I
did need to modify rtl.def and rtl.h to extend the placeholder, and I
had to pass the new call sequence through some other files as well.  So,
in calls.c I'm checking whether there's a super sib call candidate and,
if so, emitting RTL for it; and in sibcall.c I'm picking an appropriate
sequence.  The actual jump instruction and resetting of the stack pointer
(to achieve stack re-usage) is handled via the current sib call epilogue,
as this is the same for super sib calls and "ordinary" sib calls.  On x86
it usually looks like this:

           ...
           leave
           jmp foo

No need to reinvent the wheel, I guess.

> >  - super sib calls may have a positive stack frame
> 
> This isn't a problem at present.  What _is_ currently the
> restriction is that the caller can't have local variables
> whose address is taken.

That's interesting.  In sibcall.c I found something along these lines:

              /* ??? Overly conservative.  */
              || frame_offset
              ...
              sibcall = 0, tailrecursion = 0;

To me that means you're not allowed to have any locals at all, and my test
programs seem to confirm that reading.  So, what's correct?
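
For illustration, a test case of the kind at issue might look like the
sketch below.  It is a hypothetical reconstruction, not one of the actual
test programs, and the function names are made up.  The caller has a
non-empty frame and ends in a call that needs no more argument space than
the caller received itself.

```c
#include <assert.h>

/* Hypothetical callee: takes the same amount of arg space as the caller. */
static int sum2(int a, int b)
{
    return a + b;
}

/* Caller with a local array, so its frame size is not zero, and with a
   call in tail position.  A real test would also have to stop the
   compiler from inlining sum2. */
static int scaled_sum(int a, int b)
{
    int scratch[64];

    assert(a > -1000000 && a < 1000000);  /* keeps i * a free of overflow */
    for (int i = 0; i < 64; i++)
        scratch[i] = i * a;

    return sum2(scratch[3], b);           /* the tail call in question */
}
```

Whether the final call becomes a jump depends on the checks discussed
here; the sketch makes no claim about what a given GCC version does
with it.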

> >  - super sib calls must not necessarily match in function signature
> 
> This also is not a restriction at present.  The restriction is that
> the callee cannot use _more_ argument space than the caller did
> (since this space is allocated by the caller's caller).

Oops, my bad.  Super sib calls will work the same way, for the sake of
binary compatibility.  Everything more advanced will be attacked in a
second step where I'm actually using an independent calling convention.

> >  - super sib call arguments are not concerned by overlapping arguments
> 
> Better handling of this issue would be a good thing.

I think I have addressed this one already.  I use the outgoing arg space
to mangle things and store the evaluated arguments, then copy the final
results to the incoming arg space, _afterwards_.

I guess that means there won't be any overlapping args at all anymore.
My solution is simply a bit slower.
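
The difference can be modelled outside the compiler.  The sketch below is
a plain C simulation, not RTL and not code from the patch: in[] stands for
the incoming arg space, src[i] says which incoming slot the i-th new
argument is read from, and all names are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAX_ARGS 8

/* Direct: store each new argument straight into its incoming slot.
   A later argument may then read a slot that was already overwritten. */
static void move_args_direct(int *in, const int *src, int n)
{
    for (int i = 0; i < n; i++)
        in[i] = in[src[i]];
}

/* Staged: evaluate all arguments into a scratch "outgoing" area first,
   then copy the finished values back into the incoming slots. */
static void move_args_staged(int *in, const int *src, int n)
{
    int out[MAX_ARGS];

    assert(n >= 0 && n <= MAX_ARGS);
    for (int i = 0; i < n; i++)
        out[i] = in[src[i]];
    memcpy(in, out, (size_t)n * sizeof *in);
}
```

With in = {1, 2} and src = {1, 0} (think of f(a, b) ending in g(b, a)),
the direct version leaves {2, 2} in the slots, while the staged version
leaves {2, 1}, at the cost of one extra copy per argument.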

> > This is achieved by creating "almost a normal call", but right before the
> > actual call instruction would be used, I'm copying all the outgoing
> > arguments back into the incoming argument space and jump to the subroutine
> > instead.  The result is that I can reuse my incoming argument space and
> > prevent stack growth for such cases.
> 
> How is this different from "normal" sibcalls?

This is _very_ different, because the normal sibcalls don't have space on
the stack to mangle the arguments.  Instead everything is efficiently
pushed straight into the incoming arg space.  That's when you get problems
with overlapping arguments, especially when passing on your own args, etc.

I'm proposing a solution that's maybe a bit slower, but covers more cases
of tail calls.

So, thanks very much for the reply and the comments and I hope this mail
puts my approach into perspective a bit more.

Cheers,
Andi.

