This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: forcing tail/sibling call optimization


On 16-Sep-2002, Richard Henderson <rth@redhat.com> wrote:
> On Tue, Sep 17, 2002 at 12:17:55PM +1000, Fergus Henderson wrote:
> > 	* tree.h (CALL_EXPR_TAILCALL): New boolean field.
> > 	* calls.c (expand_call): Use it, and copy to the CALL_PLACEHOLDER RTX.
> > 	* rtl.def (CALL_PLACEHOLDER): Add new boolean field.
> > 	* integrate.c (copy_insn_list): Copy it.
> > 	* sibcall.c (optimize_sibling_and_tail_recursive_call): Use it.
> 
> Ok for basic-improvements-branch.

Unfortunately that patch was broken -- there was an off-by-one
error which I introduced when cleaning up the patch for posting.
The test `if (currently_expanding_call == 0)' needs to be
`if (currently_expanding_call == 1)'.  The effect of this bug
was to disable all sibling call optimization.

I did attempt to test the exact patch that I posted in the ways that
I described, but the GCC tests passed anyway (none of them test that
sibcall optimization is applied, apparently), and although I ran some
Mercury tests which would have caught the problem, there was a missing
dependency in the make file which led to me testing a version of the
Mercury compiler which had not been properly recompiled/relinked against
the modified GCC sources.

Anyway, for the record, here is a corrected version of the patch.
The only difference between the two is that I've updated the test
as described above, and I've updated the comment on the
currently_expanding_call variable declaration:

- /* Nonzero if we are currently expanding a call.  */
+ /* Counts the number of (recursive) invocations of expand_call()
+    currently active.  */

I've redone all the tests, and the results are essentially the same
as last time. (Actually better, since the GCC bug which was breaking
Mercury bootstraps seems to have been fixed, so I didn't need to
work around that.  Also the Mercury test suite failures that occurred
last time have all been fixed.)

Still OK for the gcc-3_4-basic-improvements-branch, I presume?

----------

2000-01-04  Fergus Henderson  <fjh@cs.mu.oz.au>

	Provide a way for front-ends to indicate to the back-end when
	it is safe to do a tail call.

	If the front-end marks a call as a tail-call, don't let potential
	liveness of the caller's stack inhibit tail recursion or sibling
	call optimization, and issue diagnostics if we still can't do
	one of those two optimizations.

	* tree.h (CALL_EXPR_TAILCALL): New boolean field.
	* calls.c (expand_call): Use it, and copy to the CALL_PLACEHOLDER RTX.
	* rtl.def (CALL_PLACEHOLDER): Add new boolean field.
	* integrate.c (copy_insn_list): Copy it.
	* sibcall.c (optimize_sibling_and_tail_recursive_call): Use it.

Index: gcc/gcc/calls.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/calls.c,v
retrieving revision 1.231.4.5
diff -c -p -r1.231.4.5 calls.c
*** gcc/gcc/calls.c	20 Sep 2002 01:29:06 -0000	1.231.4.5
--- gcc/gcc/calls.c	25 Sep 2002 05:23:12 -0000
*************** Software Foundation, 59 Temple Place - S
*** 35,40 ****
--- 35,42 ----
  #include "sbitmap.h"
  #include "langhooks.h"
  #include "target.h"
+ #include "diagnostic.h"
+ #include "intl.h"
  
  #if !defined FUNCTION_OK_FOR_SIBCALL
  #define FUNCTION_OK_FOR_SIBCALL(DECL) 1
*************** static sbitmap stored_args_map;
*** 149,154 ****
--- 151,160 ----
     argument list for the constructor call.  */
  int stack_arg_under_construction;
  
+ /* Counts the number of (recursive) invocations of expand_call()
+    currently active.  */
+ static int currently_expanding_call = 0;
+ 
  static int calls_function	PARAMS ((tree, int));
  static int calls_function_1	PARAMS ((tree, int));
  
*************** static int check_sibcall_argument_overla
*** 226,231 ****
--- 232,243 ----
  static int combine_pending_stack_adjustment_and_call
                                                  PARAMS ((int, struct args_size *, int));
  
+ static const char * cannot_tail_call_p	PARAMS ((bool *,
+ 					         const struct args_size *));
+ static const char * cannot_sibcall_p	PARAMS ((tree, tree, int,
+                                                  const struct args_size *,
+ 						 rtx structure_value_addr));
+ 
  #ifdef REG_PARM_STACK_SPACE
  static rtx save_fixed_argument_area	PARAMS ((int, rtx, int *, int *));
  static void restore_fixed_argument_area	PARAMS ((rtx, rtx, int, int));
*************** check_sibcall_argument_overlap (insn, ar
*** 2057,2062 ****
--- 2069,2208 ----
    return insn != NULL_RTX;
  }
  
+ /* Check if both tail recursion and sibling call optimization should be
+    inhibited for this call.  If so, return a pointer to a statically
+    allocated string containing the reason why.  Otherwise, return a
+    null pointer.
+ 
+    Also set *NOT_IN_TAIL_POSITION to true if the reason is that more code
+    is reachable after this call; for those cases, we want the diagnostic
+    to be a warning, rather than a note.  */
+ 
+ static const char *
+ cannot_tail_call_p (not_in_tail_position, args_size)
+      bool *not_in_tail_position;
+      const struct args_size *args_size;
+ {
+   /* Don't try tail calling if we're already expanding a call, as that means
+      we're an argument.  */
+ 
+   if (currently_expanding_call != 1)
+     {
+        *not_in_tail_position = true;
+        return N_("call marked as tail call is not in tail position (it is an argument to another call)");
+     }
+ 
+   /* Tail calls can make things harder to debug, and we've traditionally
+      pushed these optimizations into -O2.  */
+ 
+   if (! flag_optimize_sibling_calls)
+     return N_("can't optimize tail call: -foptimize-sibling-calls not set");
+ 
+   /* Don't try tail calling if there's cleanups, as we know there's code to
+      follow the call.  */
+ 
+   if (any_pending_cleanups (1) != 0)
+     {
+        *not_in_tail_position = true;
+        return N_("call marked as tail call is not in tail position (there are pending cleanups)");
+     }
+ 
+   /* If rtx_equal_function_value_matters is false, that means we've
+      finished with regular parsing.  Which means that some of the
+      machinery we use to generate tail-calls is no longer in place.
+      This is most often true of sjlj-exceptions, which we couldn't
+      tail-call to anyway.  */
+ 
+   if (! rtx_equal_function_value_matters)
+     return N_("can't optimize tail call emitted after regular parsing");
+ 
+   /* Don't try tail calling for variable-argument functions.
+      The current implementation won't handle those.  */
+ 
+   if (args_size->var)
+     return N_("can't optimize tail call for variable-argument function");
+ 
+   return NULL;
+ }
+ 
+ /* Check if sibling call optimization (but not tail recursion) should be
+    inhibited for this call.  If so, return a pointer to a statically
+    allocated string containing the reason why.  Otherwise, return a
+    null pointer.
+ 
+    FNDECL is the declaration of the function being called,
+    or 0 if the function is computed (not known by name).
+ 
+    FUNTYPE is the data type of the function being called.
+ 
+    FLAGS is a mask of ECF_ flags for the callee.
+ 
+    ARGS_SIZE is the total size in bytes of all the stack-parms.
+ 
+    STRUCTURE_VALUE_ADDR is the address where we should return a
+    BLKmode value, or 0 if value not BLKmode.  */
+ 
+ static const char *
+ cannot_sibcall_p (fndecl, funtype, flags, args_size, structure_value_addr)
+      tree fndecl;
+      tree funtype;
+      int flags;
+      const struct args_size *args_size;
+      rtx structure_value_addr;
+ {
+ #ifdef HAVE_sibcall_epilogue
+   bool not_supported_on_this_target = !HAVE_sibcall_epilogue;
+ #else
+   bool not_supported_on_this_target = true;
+ #endif
+   /* If the register holding the address is a callee saved
+      register, then we lose.  We have no way to prevent that,
+      so we only allow calls to named functions.  */
+   /* ??? This could be done by having the insn constraints
+      use a register class that is all call-clobbered.  Any
+      reload insns generated to fix things up would appear
+      before the sibcall_epilogue.  */
+   if (fndecl == NULL_TREE)
+     return N_("can't optimize indirect tail call");
+ 
+   /* We can't optimize calls to setjmp() or longjmp(). */
+   if (flags & (ECF_RETURNS_TWICE | ECF_LONGJMP))
+     return N_("can't optimize tail call to setjmp() or longjmp()");
+ 
+   /* We can't optimize calls to __noreturn__ functions
+      (see <http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00180.html>).  */
+   if (TREE_THIS_VOLATILE (fndecl))
+     return N_("can't optimize tail call to __noreturn__ function");
+ 
+   /* If this function requires more stack slots than the current
+      function, we cannot change it into a sibling call.  */
+   if (args_size->constant > current_function_args_size)
+     return N_("can't optimize tail call: callee arg size > caller arg size");
+ 
+   /* Doing sibling call optimization in the structure_value_addr case
+      needs some work, since structure_value_addr can be allocated on the stack.
+      It does not seem worth the effort since few optimizable
+      sibling calls will return a structure.  */
+   if (structure_value_addr != NULL_RTX)
+     return N_("can't optimize tail call: function returns structure");
+ 
+   /* If the callee pops its own arguments, then it must pop exactly
+      the same number of arguments as the current function.  */
+   if (RETURN_POPS_ARGS (fndecl, funtype, args_size->constant)
+ 	   != RETURN_POPS_ARGS (current_function_decl,
+ 				TREE_TYPE (current_function_decl),
+ 				current_function_args_size))
+     return N_("can't optimize tail call: callee arg size != caller arg size");
+ 
+   if (not_supported_on_this_target)
+     return N_("can't optimize tail calls on this target");
+ 
+   if (!FUNCTION_OK_FOR_SIBCALL (fndecl))
+     return N_("can't optimize this tail call on this target");
+ 
+   return NULL;
+ }
+ 
  /* Generate all the code for a function call
     and return an rtx for its value.
     Store the value in TARGET (specified as an rtx) if convenient.
*************** expand_call (exp, target, ignore)
*** 2069,2077 ****
       rtx target;
       int ignore;
  {
-   /* Nonzero if we are currently expanding a call.  */
-   static int currently_expanding_call = 0;
- 
    /* List of actual parameters.  */
    tree actparms = TREE_OPERAND (exp, 1);
    /* RTX for the function to be called.  */
--- 2215,2220 ----
*************** expand_call (exp, target, ignore)
*** 2082,2087 ****
--- 2225,2232 ----
    rtx normal_call_insns = NULL_RTX;
    /* Sequence of insns to perform a tail recursive "call".  */
    rtx tail_call_insns = NULL_RTX;
+   /* Nonzero if the front-end has marked this call as safe for tail calling.  */
+   int marked_as_tail_call = CALL_EXPR_TAILCALL(exp);
    /* Data type of the function.  */
    tree funtype;
    /* Declaration of the function being called,
*************** expand_call (exp, target, ignore)
*** 2091,2096 ****
--- 2236,2246 ----
    int try_tail_call = 1;
    int try_tail_recursion = 1;
    int pass;
+   /* If non-null, holds a msgid for a diagnostic to issue
+      if marked_as_tail_call is set.  */
+   const char *lose = NULL;
+   /* Set if more code remains reachable after the current call.  */
+   bool not_in_tail_position = false;
  
    /* Register in which non-BLKmode value will be returned,
       or 0 if no value or if value is BLKmode.  */
*************** expand_call (exp, target, ignore)
*** 2174,2179 ****
--- 2324,2332 ----
    /* The alignment of the stack, in bytes.  */
    HOST_WIDE_INT preferred_unit_stack_boundary;
  
+   /* Increment on entry to expand_call().  */
+   currently_expanding_call++;
+ 
    /* See if this is "nothrow" function call.  */
    if (TREE_NOTHROW (exp))
      flags |= ECF_NOTHROW;
*************** expand_call (exp, target, ignore)
*** 2406,2427 ****
  	  || (!ACCUMULATE_OUTGOING_ARGS && args_size.constant)))
      structure_value_addr = copy_to_reg (structure_value_addr);
  
!   /* Tail calls can make things harder to debug, and we're traditionally
!      pushed these optimizations into -O2.  Don't try if we're already
!      expanding a call, as that means we're an argument.  Don't try if
!      there's cleanups, as we know there's code to follow the call.
! 
!      If rtx_equal_function_value_matters is false, that means we've
!      finished with regular parsing.  Which means that some of the
!      machinery we use to generate tail-calls is no longer in place.
!      This is most often true of sjlj-exceptions, which we couldn't
!      tail-call to anyway.  */
! 
!   if (currently_expanding_call++ != 0
!       || !flag_optimize_sibling_calls
!       || !rtx_equal_function_value_matters
!       || any_pending_cleanups (1)
!       || args_size.var)
      try_tail_call = try_tail_recursion = 0;
  
    /* Tail recursion fails, when we are not dealing with recursive calls.  */
--- 2559,2568 ----
  	  || (!ACCUMULATE_OUTGOING_ARGS && args_size.constant)))
      structure_value_addr = copy_to_reg (structure_value_addr);
  
!   /* Check for various reasons why we can't do sibling call or tail recursion
!      optimization.  */
!   lose = cannot_tail_call_p (&not_in_tail_position, &args_size);
!   if (lose != NULL)
      try_tail_call = try_tail_recursion = 0;
  
    /* Tail recursion fails, when we are not dealing with recursive calls.  */
*************** expand_call (exp, target, ignore)
*** 2430,2469 ****
        || TREE_OPERAND (TREE_OPERAND (exp, 0), 0) != current_function_decl)
      try_tail_recursion = 0;
  
!   /*  Rest of purposes for tail call optimizations to fail.  */
!   if (
! #ifdef HAVE_sibcall_epilogue
!       !HAVE_sibcall_epilogue
! #else
!       1
! #endif
!       || !try_tail_call
!       /* Doing sibling call optimization needs some work, since
! 	 structure_value_addr can be allocated on the stack.
! 	 It does not seem worth the effort since few optimizable
! 	 sibling calls will return a structure.  */
!       || structure_value_addr != NULL_RTX
!       /* If the register holding the address is a callee saved
! 	 register, then we lose.  We have no way to prevent that,
! 	 so we only allow calls to named functions.  */
!       /* ??? This could be done by having the insn constraints
! 	 use a register class that is all call-clobbered.  Any
! 	 reload insns generated to fix things up would appear
! 	 before the sibcall_epilogue.  */
!       || fndecl == NULL_TREE
!       || (flags & (ECF_RETURNS_TWICE | ECF_LONGJMP))
!       || TREE_THIS_VOLATILE (fndecl)
!       || !FUNCTION_OK_FOR_SIBCALL (fndecl)
!       /* If this function requires more stack slots than the current
! 	 function, we cannot change it into a sibling call.  */
!       || args_size.constant > current_function_args_size
!       /* If the callee pops its own arguments, then it must pop exactly
! 	 the same number of arguments as the current function.  */
!       || RETURN_POPS_ARGS (fndecl, funtype, args_size.constant)
! 	 != RETURN_POPS_ARGS (current_function_decl,
! 			      TREE_TYPE (current_function_decl),
! 			      current_function_args_size))
!     try_tail_call = 0;
  
    if (try_tail_call || try_tail_recursion)
      {
--- 2571,2584 ----
        || TREE_OPERAND (TREE_OPERAND (exp, 0), 0) != current_function_decl)
      try_tail_recursion = 0;
  
!   /* Rest of purposes for sibling call optimizations to fail.  */
!   if (try_tail_call)
!     {
!       lose = cannot_sibcall_p (fndecl, funtype, flags, &args_size,
!                                structure_value_addr);
!       if (lose != NULL)
!         try_tail_call = 0;
!     }
  
    if (try_tail_call || try_tail_recursion)
      {
*************** expand_call (exp, target, ignore)
*** 2531,2537 ****
        /* Expanding one of those dangerous arguments could have added
  	 cleanups, but otherwise give it a whirl.  */
        if (any_pending_cleanups (1))
! 	try_tail_call = try_tail_recursion = 0;
      }
  
    /* Generate a tail recursion sequence when calling ourselves.  */
--- 2646,2655 ----
        /* Expanding one of those dangerous arguments could have added
  	 cleanups, but otherwise give it a whirl.  */
        if (any_pending_cleanups (1))
!         {
!           lose = N_("call marked as tail call is not in tail position (there are pending cleanups)");
! 	  try_tail_call = try_tail_recursion = 0;
! 	}
      }
  
    /* Generate a tail recursion sequence when calling ourselves.  */
*************** expand_call (exp, target, ignore)
*** 2563,2569 ****
        if (optimize_tail_recursion (actparms, get_last_insn ()))
  	{
  	  if (any_pending_cleanups (1))
! 	    try_tail_call = try_tail_recursion = 0;
  	  else
  	    tail_recursion_insns = get_insns ();
  	}
--- 2681,2690 ----
        if (optimize_tail_recursion (actparms, get_last_insn ()))
  	{
  	  if (any_pending_cleanups (1))
! 	    {
! 	      try_tail_call = try_tail_recursion = 0;
!               lose = N_("call marked as tail call is not in tail position (there are pending cleanups)");
! 	    }
  	  else
  	    tail_recursion_insns = get_insns ();
  	}
*************** expand_call (exp, target, ignore)
*** 2964,2970 ****
  		|| (pass == 0
  		    && check_sibcall_argument_overlap (before_arg,
  						       &args[i])))
! 	      sibcall_failure = 1;
  	  }
  
        /* If we have a parm that is passed in registers but not in memory
--- 3085,3094 ----
  		|| (pass == 0
  		    && check_sibcall_argument_overlap (before_arg,
  						       &args[i])))
! 	      {
! 	        sibcall_failure = 1;
! 		lose = N_("can't optimize tail call: arguments overlap");
! 	      }
  	  }
  
        /* If we have a parm that is passed in registers but not in memory
*************** expand_call (exp, target, ignore)
*** 2989,2994 ****
--- 3113,3119 ----
  		      && check_sibcall_argument_overlap (before_arg,
  							 &args[i])))
  		sibcall_failure = 1;
+ 		lose = N_("can't optimize tail call: argument registers overlap");
  	    }
  
        /* If we pushed args in forward order, perform stack alignment
*************** expand_call (exp, target, ignore)
*** 3173,3178 ****
--- 3298,3304 ----
  	      && REGNO (target) < FIRST_PSEUDO_REGISTER)
  	    target = 0;
  	  sibcall_failure = 1;
+ 	  lose = N_("call marked as tail call is not in tail position (pending cleanups)");
  	}
  
        if (TYPE_MODE (TREE_TYPE (exp)) == VOIDmode
*************** expand_call (exp, target, ignore)
*** 3219,3224 ****
--- 3345,3351 ----
  
  	  /* We can not support sibling calls for this case.  */
  	  sibcall_failure = 1;
+ 	  lose = N_("can't optimize tail call: return value has multiple non-contiguous locations");
  	}
        else if (target
  	       && GET_MODE (target) == TYPE_MODE (TREE_TYPE (exp))
*************** expand_call (exp, target, ignore)
*** 3238,3243 ****
--- 3365,3371 ----
  
  	  /* We can not support sibling calls for this case.  */
  	  sibcall_failure = 1;
+ 	  lose = N_("can't optimize tail call: return value has block mode");
  	}
        else
  	target = copy_to_reg (valreg);
*************** expand_call (exp, target, ignore)
*** 3286,3291 ****
--- 3414,3420 ----
  	  highest_outgoing_arg_in_use = initial_highest_arg_in_use;
  	  stack_usage_map = initial_stack_usage_map;
  	  sibcall_failure = 1;
+ 	  lose = N_("can't optimize tail call: size of args is variable, or this was a constructor call for a stack argument");
  	}
        else if (ACCUMULATE_OUTGOING_ARGS && pass)
  	{
*************** expand_call (exp, target, ignore)
*** 3410,3421 ****
        emit_call_insn (gen_rtx_CALL_PLACEHOLDER (VOIDmode, normal_call_insns,
  						tail_call_insns,
  						tail_recursion_insns,
! 						tail_recursion_label));
      }
    else
!     emit_insn (normal_call_insns);
  
!   currently_expanding_call--;
  
    /* If this function returns with the stack pointer depressed, ensure
       this block saves and restores the stack pointer, show it was
--- 3539,3564 ----
        emit_call_insn (gen_rtx_CALL_PLACEHOLDER (VOIDmode, normal_call_insns,
  						tail_call_insns,
  						tail_recursion_insns,
! 						tail_recursion_label,
! 						marked_as_tail_call));
      }
    else
!     {
!       /* Issue a diagnostic, if needed.  */
!       if (marked_as_tail_call)
!         {
! 	  /* lose should be set above.  But just in case... */
! 	  if (lose == NULL)
! 	    lose = N_("can't optimize tail call");
! 
! 	  if (not_in_tail_position)
!             warning (lose);
!           else
!             inform (lose);
!         }
  
!       emit_insn (normal_call_insns);
!     }
  
    /* If this function returns with the stack pointer depressed, ensure
       this block saves and restores the stack pointer, show it was
*************** expand_call (exp, target, ignore)
*** 3427,3432 ****
--- 3570,3578 ----
        emit_move_insn (virtual_stack_dynamic_rtx, stack_pointer_rtx);
        save_stack_pointer ();
      }
+ 
+   /* Decrement on exit.  */
+   currently_expanding_call--;
  
    return target;
  }
Index: gcc/gcc/integrate.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/integrate.c,v
retrieving revision 1.202.4.1
diff -c -p -r1.202.4.1 integrate.c
*** gcc/gcc/integrate.c	17 Sep 2002 22:58:44 -0000	1.202.4.1
--- gcc/gcc/integrate.c	25 Sep 2002 05:23:12 -0000
*************** copy_insn_list (insns, map, static_chain
*** 1551,1556 ****
--- 1551,1557 ----
  	    {
  	      rtx sequence[3];
  	      rtx tail_label;
+ 	      int marked_as_tailcall;
  
  	      for (i = 0; i < 3; i++)
  		{
*************** copy_insn_list (insns, map, static_chain
*** 1572,1582 ****
  	      tail_label = copy_rtx_and_substitute (XEXP (PATTERN (insn), 3),
  						    map, 0);
  
! 	      copy = emit_call_insn (gen_rtx_CALL_PLACEHOLDER (VOIDmode,
! 							       sequence[0],
! 							       sequence[1],
! 							       sequence[2],
! 							       tail_label));
  	      break;
  	    }
  
--- 1573,1584 ----
  	      tail_label = copy_rtx_and_substitute (XEXP (PATTERN (insn), 3),
  						    map, 0);
  
! 	      marked_as_tailcall = XINT (PATTERN (insn), 4);
! 
! 	      copy = emit_call_insn (gen_rtx_CALL_PLACEHOLDER
! 	      			     (VOIDmode, sequence[0], sequence[1],
! 				      sequence[2], tail_label,
! 				      marked_as_tailcall));
  	      break;
  	    }
  
Index: gcc/gcc/rtl.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/rtl.def,v
retrieving revision 1.59
diff -c -p -r1.59 rtl.def
*** gcc/gcc/rtl.def	15 Aug 2002 01:03:43 -0000	1.59
--- gcc/gcc/rtl.def	25 Sep 2002 05:23:12 -0000
*************** DEF_RTL_EXPR(CONSTANT_P_RTX, "constant_p
*** 1091,1110 ****
     Immediately after RTL generation, this placeholder will be replaced
     by the insns to perform the call, sibcall or tail recursion.
  
!    This RTX has 4 operands.  The first three are lists of instructions to
     perform the call as a normal call, sibling call and tail recursion
     respectively.  The latter two lists may be NULL, the first may never
     be NULL.
  
!    The last operand is the tail recursion CODE_LABEL, which may be NULL if no 
     potential tail recursive calls were found.
  
     The tail recursion label is needed so that we can clear LABEL_PRESERVE_P
     after we select a call method.
  
!    This method of tail-call elimination is intended to be replaced by
!    tree-based optimizations once front-end conversions are complete.  */
! DEF_RTL_EXPR(CALL_PLACEHOLDER, "call_placeholder", "uuuu", 'x')
  
  /* Describes a merge operation between two vector values.
     Operands 0 and 1 are the vectors to be merged, operand 2 is a bitmask
--- 1091,1120 ----
     Immediately after RTL generation, this placeholder will be replaced
     by the insns to perform the call, sibcall or tail recursion.
  
!    This RTX has 5 operands.  The first three are lists of instructions to
     perform the call as a normal call, sibling call and tail recursion
     respectively.  The latter two lists may be NULL, the first may never
     be NULL.
  
!    The fourth operand is the tail recursion CODE_LABEL, which may be NULL if no
     potential tail recursive calls were found.
  
     The tail recursion label is needed so that we can clear LABEL_PRESERVE_P
     after we select a call method.
  
!    The last operand is used by front-ends that want to mark calls as suitable
!    for tail call optimization.  If the last operand is nonzero, this indicates
!    that the call is in a tail position, and the caller's stack frame is no
!    longer live.  The back-end will report an error if the call is not in
!    a tail position.  The back-end will trust the front-end about the liveness
!    of the caller's stack frame; any use of the caller's stack frame after
!    such a call is undefined behaviour.  The back-end will report a diagnostic
!    if the call cannot be optimized as a tail call.
! 
!    The CALL_PLACEHOLDER RTL method of tail-call elimination is intended to be
!    replaced by tree-based optimizations once front-end conversions are
!    complete.  */
! DEF_RTL_EXPR(CALL_PLACEHOLDER, "call_placeholder", "uuuui", 'x')
  
  /* Describes a merge operation between two vector values.
     Operands 0 and 1 are the vectors to be merged, operand 2 is a bitmask
Index: gcc/gcc/sibcall.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/sibcall.c,v
retrieving revision 1.39.4.1
diff -c -p -r1.39.4.1 sibcall.c
*** gcc/gcc/sibcall.c	20 Sep 2002 01:29:07 -0000	1.39.4.1
--- gcc/gcc/sibcall.c	25 Sep 2002 05:23:12 -0000
*************** Software Foundation, 59 Temple Place - S
*** 32,37 ****
--- 32,39 ----
  #include "output.h"
  #include "except.h"
  #include "tree.h"
+ #include "diagnostic.h"
+ #include "toplev.h"
  
  /* In case alternate_exit_block contains copy from pseudo, to return value,
     record the pseudo here.  In such case the pseudo must be set to function
*************** optimize_sibling_and_tail_recursive_call
*** 574,579 ****
--- 576,582 ----
    rtx insn, insns;
    basic_block alternate_exit = EXIT_BLOCK_PTR;
    bool no_sibcalls_this_function = false;
+   bool no_unmarked_sibcalls_this_function = false;
    bool successful_replacement = false;
    bool replaced_call_placeholder = false;
    edge e;
*************** optimize_sibling_and_tail_recursive_call
*** 651,658 ****
      }
  
    /* If the function uses ADDRESSOF, we can't (easily) determine
!      at this point if the value will end up on the stack.  */
!   no_sibcalls_this_function |= sequence_uses_addressof (insns);
  
    /* Walk the insn chain and find any CALL_PLACEHOLDER insns.  We need to
       select one of the insn sequences attached to each CALL_PLACEHOLDER.
--- 654,663 ----
      }
  
    /* If the function uses ADDRESSOF, we can't (easily) determine
!      at this point if the value will end up on the stack.
!      In that case, sibcalls can only be allowed if the front-end
!      has explicitly marked the call as a tail call.  */
!   no_unmarked_sibcalls_this_function |= sequence_uses_addressof (insns);
  
    /* Walk the insn chain and find any CALL_PLACEHOLDER insns.  We need to
       select one of the insn sequences attached to each CALL_PLACEHOLDER.
*************** optimize_sibling_and_tail_recursive_call
*** 670,694 ****
  	{
  	  int sibcall = (XEXP (PATTERN (insn), 1) != NULL_RTX);
  	  int tailrecursion = (XEXP (PATTERN (insn), 2) != NULL_RTX);
  	  basic_block call_block = BLOCK_FOR_INSN (insn);
  
! 	  /* alloca (until we have stack slot life analysis) inhibits
  	     sibling call optimizations, but not tail recursion.
  	     Similarly if we use varargs or stdarg since they implicitly
! 	     may take the address of an argument.  */
! 	  if (current_function_calls_alloca || current_function_stdarg)
  	    sibcall = 0;
  
  	  /* See if there are any reasons we can't perform either sibling or
  	     tail call optimizations.  We must be careful with stack slots
! 	     which are live at potential optimization sites.  */
  	  if (no_sibcalls_this_function
- 	      /* ??? Overly conservative.  */
- 	      || frame_offset
- 	      /* Any function that calls setjmp might have longjmp called from
- 		 any called function.  ??? We really should represent this
- 		 properly in the CFG so that this needn't be special cased.  */
- 	      || current_function_calls_setjmp
  	      /* Can't if more than one successor or single successor is not
  		 exit block.  These two tests prevent tail call optimization
  		 in the presense of active exception handlers.  */
--- 675,711 ----
  	{
  	  int sibcall = (XEXP (PATTERN (insn), 1) != NULL_RTX);
  	  int tailrecursion = (XEXP (PATTERN (insn), 2) != NULL_RTX);
+ 	  int marked_as_tailcall = (XINT (PATTERN (insn), 4));
  	  basic_block call_block = BLOCK_FOR_INSN (insn);
  
! 	  /* alloca (until we have stack slot life analysis) usually inhibits
  	     sibling call optimizations, but not tail recursion.
  	     Similarly if we use varargs or stdarg since they implicitly
! 	     may take the address of an argument.
! 	     But these constructs don't inhibit sibling call optimization if
! 	     front-end has marked this call as a tail call (in that case,
! 	     stack liveness analysis is the front-end's reponsibility).  */
! 	  if (!marked_as_tailcall
! 	      && (current_function_calls_alloca || current_function_stdarg))
  	    sibcall = 0;
  
  	  /* See if there are any reasons we can't perform either sibling or
  	     tail call optimizations.  We must be careful with stack slots
! 	     which are live at potential optimization sites, unless
! 	     the front-end marked this call as a tail call (in that case,
! 	     stack liveness analysis is the front-end's reponsibility).  */
! 	  if (!marked_as_tailcall
! 	      && (no_unmarked_sibcalls_this_function
! 	          /* ??? Overly conservative.  */
! 	          || frame_offset
! 	          /* Any function that calls setjmp might have longjmp called
! 		     from any called function.  ??? We really should represent
! 		     this properly in the CFG so that this needn't be special
! 		     cased.  */
! 	          || current_function_calls_setjmp))
! 	    sibcall = 0, tailrecursion = 0;
! 
  	  if (no_sibcalls_this_function
  	      /* Can't if more than one successor or single successor is not
  		 exit block.  These two tests prevent tail call optimization
  		 in the presense of active exception handlers.  */
*************** optimize_sibling_and_tail_recursive_call
*** 699,705 ****
  	      /* If this call doesn't end the block, there are operations at
  		 the end of the block which we must execute after returning.  */
  	      || ! call_ends_block_p (insn, call_block->end))
! 	    sibcall = 0, tailrecursion = 0;
  
  	  /* Select a set of insns to implement the call and emit them.
  	     Tail recursion is the most efficient, so select it over
--- 716,727 ----
  	      /* If this call doesn't end the block, there are operations at
  		 the end of the block which we must execute after returning.  */
  	      || ! call_ends_block_p (insn, call_block->end))
! 	    {
! 	      sibcall = 0, tailrecursion = 0;
! 	      if (marked_as_tailcall)
! 	        /* ??? Possibly incorrect diagnostic location.  */
! 	        warning ("call marked as tail call is not in tail position");
! 	    }
  
  	  /* Select a set of insns to implement the call and emit them.
  	     Tail recursion is the most efficient, so select it over
Index: gcc/gcc/tree.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree.h,v
retrieving revision 1.349.4.6
diff -c -p -r1.349.4.6 tree.h
*** gcc/gcc/tree.h	20 Sep 2002 05:29:37 -0000	1.349.4.6
--- gcc/gcc/tree.h	25 Sep 2002 05:23:12 -0000
*************** struct tree_common GTY(())
*** 244,249 ****
--- 244,251 ----
             FUNCTION_DECL
         SAVE_EXPR_NOPLACEHOLDER in
  	   SAVE_EXPR
+        CALL_EXPR_TAILCALL in
+            CALL_EXPR, METHOD_CALL_EXPR
  
     asm_written_flag:
  
*************** struct tree_vec GTY(())
*** 845,850 ****
--- 847,863 ----
  /* In a CONSTRUCTOR node.  */
  #define CONSTRUCTOR_ELTS(NODE) TREE_OPERAND (CONSTRUCTOR_CHECK (NODE), 1)
  
+ /* In a CALL_EXPR or METHOD_CALL_EXPR, nonzero means that this call
+    should be treated as a tail recursive call or sibling call,
+    even if there are still local variables that appear to be live.
+    The front-end must only set this if the call appears in a tail
+    position, i.e. if the caller will do nothing after the call
+    except return.
+    The back-end will issue a warning if it can't generate a call
+    for which this flag is nonzero as a tail recursive call or
+    sibling call.  */
+ #define CALL_EXPR_TAILCALL(NODE) ((NODE)->common.unsigned_flag)
+   
  /* In ordinary expression nodes.  */
  #define TREE_OPERAND(NODE, I) (EXPR_CHECK (NODE)->exp.operands[I])
  #define TREE_COMPLEXITY(NODE) (EXPR_CHECK (NODE)->exp.complexity)
-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]