This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: proper tail recursion for gcc


Richard Henderson <rth@cygnus.com> writes:

> On Wed, Jul 19, 2000 at 11:33:08PM -0700, Zack Weinberg wrote:
> > It's not as complete as a Scheme compiler is required to implement.
> > The Scheme standard requires sibcall optimization for *every* function
> > call followed immediately by a return from the current function.
> 
> Indeed.  And such complete tail call optimizations are impossible
> with a stack-based call frame, such as is used for C and so forth.

This is false.  General tail-call-elimination is quite compatible
with a stack-based call frame.  It can even be made compatible with
normal calling conventions, though it is easier if you have a
calling convention where the called function (callee) pops the stack.

> For Scheme one must put call frames in the heap and garbage collect them.

That is sort-of true, but has nothing to do with tail-call-elimination.
Full lexical closures and call-with-current-continuation are most
noturally implemented as you described, but there are various tricks
and hybrid implementations that people do.

> > We only seem to do the optimization when the argument lists and return
> > types are identical.
> > 
> > Consider, for instance:
> > 
> > foo(a, b)
> > {
> >   c = calculate_bar(a, b);
> >   foo_with_bar(a, b, c);
> > }
> > 
> > foo_with_bar(a, b, c)
> > {
> >   ...
> > }
> > 
> > We don't sibcall optimize this, but we could and probably should.
> 
> No, we can't.  Foo knows that it has room on the stack for
> two arguments, since that's what it was given.  Foo_with_bar
> requires three arguments, so there's no enough room to 
> perform the tail call without changing the caller to pop more
> from the stack frame.

It is possible though ugly.  You can adjust the stack frame to
make room for the extra argument.  I.e. you change:
        a, b, return_pc, fp, locals ...
to:
        a, b, c, return_pc
and then jump to foo_with_bar.
When foo_with_bar returns, it will return to foo's caller, which
will pop off b and c.  a remains on the stack, but does no harm.
When foo's caller returns, a will get popped, just as if a had
been alocated with alloca.

This will fail if foo's caller is compiled to not use a frame pointer,
since the stack pointer will not be where it expects it.
-- 
	--Per Bothner
per@bothner.com   http://www.bothner.com/~per/

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]