This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: forcing tail/sibling call optimization


On 03-Jan-2001, Richard Henderson <rth@redhat.com> wrote:
> On Thu, Jan 04, 2001 at 07:24:14AM +1100, Fergus Henderson wrote:
> > 	* tree.h (CALL_EXPR_FORCE_TAILCALL): New boolean field.
> 
> This part is fine.
> 
> > 	* expr.c (expand_expr): use expand_call_1 rather than expand_call,
> > 	  passing CALL_EXPR_FORCE_TAILCALL field.
> 
> This should be completely unnecessary -- expand_call receives
> the CALL_EXPR, from which it can read CALL_EXPR_FORCE_TAILCALL.

You're right.  My mistake.

> > 	* rtl.def (CALL_PLACEHOLDER): Add new boolean field force_tailcall.
> > 	  integrate.c (copy_insn_list): Copy it.
> > 	  sibcall.c (optimize_sibling_and_tail_recursive_call,
> > 	  replace_call_placeholder) Use it.
> 
> I'm a bit confused by your intended usage here.  Are you
> intending CALL_EXPR_FORCE_TAILCALL to indicate that it is
> _possible_ to perform a tail call despite otherwise 
> potential problems with the local stack frame?  Or are you
> intending to indicate that this _will_ be a tail call.

Hmm, good question ;-)

Just the former, in general.  The front-end author can't guess at what
machine-specific or implementation-specific reasons there might be for
not doing a tail call.  All the front-end can do is to let the
back-end know that the locals are not live at this call,
and (perhaps) that the call is in a tail position.
So in general it can only be a hint.

However, in the special case where the parameter types and return type
of the callee matches the parameter types and return type of the caller,
and it the functions are not varargs functions, then the back-end
really ought to respect the hint.  Likewise, if the planned
forthcoming special "tailcall" calling convention is used, the
back-end really ought to respect the hint.  So in those cases,
it should be more than just a hint.

But since the current implementation of the gcc back-end doesn't have
the special tail call convention yet, and since it doesn't always
optimize tail calls even when the parameter and return types match and
the function is not a varargs function, the two statements in the
previous paragraph had better remain just implementation advice
("should") rather than a formal requirement ("shall"), at least in the
short term.  

In the long term it would be nice if one or the other or both could
eventually be made formal requirements.  Then front-ends for languages
like ML that really *need* tail calls could use this.

> The later usage seems more in tune with the name of the
> flag.

Good point.  How about I rename it as "CALL_EXPR_USE_TAILCALL"?
That seems more neutral.

Alternatively, it could be "CALL_EXPR_HINT_TAILCALL"
or "CALL_EXPR_TAILCALL_HINT".  That might be more
descriptive for now.  But as discussed above, in the
long term I want this flag to be more than a hint.

I suppose we could have two different flags, one for the hint, and one
for the requirement.  But those tree flag bits seem to be in fairly
short supply.  Is it worth using two flags?

> In which case the change to CALL_PLACEHOLDER is 
> not needed.  That is used to choose between alternate
> code sequences at some later date.  If we _know_ that 
> we are going to tail call, then we can simply generate
> either the tail recursion or sibling call sequence 
> directly and be done with it.

You're right here too, the change to the CALL_PLACEHOLDER rtx
is not needed.  This is true even if the flag is just a hint,
so long as the hint carries the impliciation that the call is
in a tail position, not just that the locals aren't live.  I was
thinking that the architecture- and calling-convention-specific
reasons why sibling call optimization might fail were checked
in optimize_sibling_and_tail_recursive_call().  But I see now
that we already check all those reasons in expand_call(), and
optimize_sibling_and_tail_recursive_call is just checking the
tail-ness of the call and the liveness of the locals.

> Hmm.  I do suppose that would require that inlining be
> disabled, or that it happen at the tree level.

Well, we do inlining in the Mercury compiler front-end anyway,
at the "High Level Data Structure" (annotated predicate
calculus) level, so inlining in the back-end is not critical
for us.

> But I'm not sure what sort of evil inlining would imply for 
> such functional languagaes as you are implementing.

There shouldn't be any problem, so long as you do proper
tail call handling for the resulting function after inlining.
That is, when inlining a call to a function foo that contains tail
calls to bar1, bar2, etc., if the call to foo is itself a tail
call then the inlined calls to bar1, bar2, etc. should remain
tail calls.

If the call to foo is not a tail call, then it's OK to
inline it and make the calls to bar1, bar2, etc. ordinary
calls rather than tail calls.  That can only increase
stack usage by a constant factor, I think.

Anyway, I'll post a revised diff sometime in the moderately near future.
Thanks for the review.

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]