This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: forcing tail/sibling call optimization


On 01-Dec-2000, Joe Buck <jbuck@racerx.synopsys.com> wrote:
> 
> I suspect that a very clean patch that implements the desired
> functionality will help your case a lot.

OK, here's a start.

2000-01-04  Fergus Henderson  <fjh@cs.mu.oz.au>

	Allow front-ends to force tail call optimization.

	* tree.h (CALL_EXPR_FORCE_TAILCALL): New boolean field.
	* expr.c (expand_expr): use expand_call_1 rather than expand_call,
	  passing CALL_EXPR_FORCE_TAILCALL field.

	* calls.h, calls.c (expand_call_1): New.  Like expand_call,
	  but with additional boolean FORCE_TAILCALL parameter that it
	  copies to the CALL_PLACEHOLDER rtx.
	  (expand_call): call expand_call_1.

	* rtl.def (CALL_PLACEHOLDER): Add new boolean field force_tailcall.
	  integrate.c (copy_insn_list): Copy it.
	  sibcall.c (optimize_sibling_and_tail_recursive_call,
	  replace_call_placeholder) Use it.

This patch is not yet "very clean".  One issue is that the
patch adds another argument to expand_call().  But I was a bit
tentative about changing the front-end interface, so instead I added
a new function expand_call_1().  Would it be better to go ahead and
change the interface to expand_call()? 
(Alternatively, any suggestions for a better name for the new function?)

Another issue is that this patch adds calls to warning() in the
back-end, and the line numbers on those warnings don't come out
right (this might be a bug in my front-end, though -- I've haven't
made much effort to get the line number info right yet).

Another potential issue is that I had to use a whole new field in the
CALL_PLACEHOLDER rtx, rather than using a bit-field, since all
of the bit-field flags seem to be already in use.  But I don't
see any reasonable alternative to that.

I've haven't tested this much yet.

I also have some changes to add C syntax for accessing this
feature, but those are not finished yet.

Index: calls.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/calls.c,v
retrieving revision 1.172
diff -u -d -u -c -p -r1.172 calls.c
*** calls.c	2000/12/30 14:52:15	1.172
--- calls.c	2001/01/03 19:49:11
*************** expand_call (exp, target, ignore)
*** 2064,2069 ****
--- 2064,2088 ----
       rtx target;
       int ignore;
  {
+   return expand_call_1(exp, target, ignore, 0);
+ }
+ 
+ /* Generate all the code for a function call
+    and return an rtx for its value.
+    Store the value in TARGET (specified as an rtx) if convenient.
+    If the value is stored in TARGET then TARGET is returned.
+    If IGNORE is nonzero, then we ignore the value of the function call.
+    If FORCE_TAILCALL is nonzero, then we should try to do sibling call
+    or tail recursion optimization for this call, even if we can't prove
+    that the locals and parameters are no longer live at the call.  */
+ 
+ rtx
+ expand_call_1 (exp, target, ignore, force_tailcall)
+      tree exp;
+      rtx target;
+      int ignore;
+      int force_tailcall;
+ {
    /* Nonzero if we are currently expanding a call.  */
    static int currently_expanding_call = 0;
  
*************** expand_call (exp, target, ignore)
*** 3436,3445 ****
        emit_call_insn (gen_rtx_CALL_PLACEHOLDER (VOIDmode, normal_call_insns,
  						tail_call_insns,
  						tail_recursion_insns,
! 						tail_recursion_label));
      }
    else
!     emit_insns (normal_call_insns);
  
    currently_expanding_call--;
  
--- 3455,3472 ----
        emit_call_insn (gen_rtx_CALL_PLACEHOLDER (VOIDmode, normal_call_insns,
  						tail_call_insns,
  						tail_recursion_insns,
! 						tail_recursion_label,
! 						force_tailcall));
      }
    else
!     {
!       /* If the user told us to force last call optimization,
!          and we tried, but we couldn't, then issue a warning.  */
!       if (force_tailcall && flag_optimize_sibling_calls)
!         warning("cannot optimize tail call");
! 
!       emit_insns (normal_call_insns);
!     }
  
    currently_expanding_call--;
  
Index: expr.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/expr.c,v
retrieving revision 1.284
diff -u -d -u -c -p -r1.284 expr.c
*** expr.c	2000/12/31 01:53:54	1.284
--- expr.c	2001/01/03 19:49:15
*************** expand_expr (exp, target, tmode, modifie
*** 7245,7251 ****
  	    return expand_builtin (exp, target, subtarget, tmode, ignore);
  	}
  
!       return expand_call (exp, target, ignore);
  
      case NON_LVALUE_EXPR:
      case NOP_EXPR:
--- 7245,7252 ----
  	    return expand_builtin (exp, target, subtarget, tmode, ignore);
  	}
  
!       return expand_call_1 (exp, target, ignore,
!       			    CALL_EXPR_FORCE_TAILCALL (exp));
  
      case NON_LVALUE_EXPR:
      case NOP_EXPR:
Index: expr.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/expr.h,v
retrieving revision 1.73
diff -u -d -u -c -p -r1.73 expr.h
*** expr.h	2000/12/03 23:58:44	1.73
--- expr.h	2001/01/03 19:49:16
*************** extern rtx hard_function_value PARAMS ((
*** 1130,1135 ****
--- 1130,1136 ----
  extern rtx prepare_call_address	PARAMS ((rtx, tree, rtx *, int));
  
  extern rtx expand_call PARAMS ((tree, rtx, int));
+ extern rtx expand_call_1 PARAMS ((tree, rtx, int, int));
  
  extern rtx expand_shift PARAMS ((enum tree_code, enum machine_mode, rtx, tree,
  				 rtx, int));
Index: integrate.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/integrate.c,v
retrieving revision 1.123
diff -u -d -u -c -p -r1.123 integrate.c
*** integrate.c	2000/12/30 13:10:50	1.123
--- integrate.c	2001/01/03 19:49:17
*************** copy_insn_list (insns, map, static_chain
*** 1430,1435 ****
--- 1430,1436 ----
  	    {
  	      rtx sequence[3];
  	      rtx tail_label;
+ 	      int force_tailcall;
  
  	      for (i = 0; i < 3; i++)
  		{
*************** copy_insn_list (insns, map, static_chain
*** 1451,1461 ****
  	      tail_label = copy_rtx_and_substitute (XEXP (PATTERN (insn), 3),
  						    map, 0);
  
  	      copy = emit_call_insn (gen_rtx_CALL_PLACEHOLDER (VOIDmode,
  							       sequence[0],
  							       sequence[1],
  							       sequence[2],
! 							       tail_label));
  	      break;
  	    }
  
--- 1452,1465 ----
  	      tail_label = copy_rtx_and_substitute (XEXP (PATTERN (insn), 3),
  						    map, 0);
  
+ 	      force_tailcall = XINT (PATTERN (insn), 4);
+ 
  	      copy = emit_call_insn (gen_rtx_CALL_PLACEHOLDER (VOIDmode,
  							       sequence[0],
  							       sequence[1],
  							       sequence[2],
! 							       tail_label,
! 							       force_tailcall));
  	      break;
  	    }
  
Index: rtl.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/rtl.def,v
retrieving revision 1.43
diff -u -d -u -c -p -r1.43 rtl.def
*** rtl.def	2000/12/29 17:35:57	1.43
--- rtl.def	2001/01/03 19:49:18
*************** DEF_RTL_EXPR(CONSTANT_P_RTX, "constant_p
*** 919,938 ****
     Immediately after RTL generation, this placeholder will be replaced
     by the insns to perform the call, sibcall or tail recursion.
  
!    This RTX has 4 operands.  The first three are lists of instructions to
     perform the call as a normal call, sibling call and tail recursion
     respectively.  The latter two lists may be NULL, the first may never
     be NULL.
  
!    The last operand is the tail recursion CODE_LABEL, which may be NULL if no 
     potential tail recursive calls were found.
  
     The tail recursion label is needed so that we can clear LABEL_PRESERVE_P
     after we select a call method.
  
     This method of tail-call elimination is intended to be replaced by
     tree-based optimizations once front-end conversions are complete.  */
! DEF_RTL_EXPR(CALL_PLACEHOLDER, "call_placeholder", "uuuu", 'x')
  
  /* Describes a merge operation between two vector values.
     Operands 0 and 1 are the vectors to be merged, operand 2 is a bitmask
--- 919,944 ----
     Immediately after RTL generation, this placeholder will be replaced
     by the insns to perform the call, sibcall or tail recursion.
  
!    This RTX has 5 operands.  The first three are lists of instructions to
     perform the call as a normal call, sibling call and tail recursion
     respectively.  The latter two lists may be NULL, the first may never
     be NULL.
  
!    The fourth operand is the tail recursion CODE_LABEL, which may be NULL if no
     potential tail recursive calls were found.
  
     The tail recursion label is needed so that we can clear LABEL_PRESERVE_P
     after we select a call method.
  
+    The last operand is used by front-ends that want to force tail call
+    optimization.  If the last operand is nonzero, this indicates that the
+    call should be treated as a sibling call or tail recursive call, even
+    if it does not appear to be one.  The back-end will issue a warning if
+    it cannot cannot do so.
+ 
     This method of tail-call elimination is intended to be replaced by
     tree-based optimizations once front-end conversions are complete.  */
! DEF_RTL_EXPR(CALL_PLACEHOLDER, "call_placeholder", "uuuui", 'x')
  
  /* Describes a merge operation between two vector values.
     Operands 0 and 1 are the vectors to be merged, operand 2 is a bitmask
Index: sibcall.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/sibcall.c,v
retrieving revision 1.10
diff -u -d -u -c -p -r1.10 sibcall.c
*** sibcall.c	2000/10/24 11:25:50	1.10
--- sibcall.c	2001/01/03 19:49:18
*************** replace_call_placeholder (insn, use)
*** 415,421 ****
    else if (use == sibcall_use_sibcall)
      emit_insns_before (XEXP (PATTERN (insn), 1), insn);
    else if (use == sibcall_use_normal)
!     emit_insns_before (XEXP (PATTERN (insn), 0), insn);
    else
      abort();
  
--- 415,425 ----
    else if (use == sibcall_use_sibcall)
      emit_insns_before (XEXP (PATTERN (insn), 1), insn);
    else if (use == sibcall_use_normal)
!     {
!       if (XEXP (PATTERN (insn), 4))
!         warning("cannot optimize tail call");
!       emit_insns_before (XEXP (PATTERN (insn), 0), insn);
!     }
    else
      abort();
  
*************** optimize_sibling_and_tail_recursive_call
*** 533,560 ****
  	{
  	  int sibcall = (XEXP (PATTERN (insn), 1) != NULL_RTX);
  	  int tailrecursion = (XEXP (PATTERN (insn), 2) != NULL_RTX);
  	  basic_block succ_block, call_block;
  	  rtx temp, hardret, softret;
  
! 	  /* We must be careful with stack slots which are live at
! 	     potential optimization sites.
  
! 	     ?!? This test is overly conservative and will be replaced.  */
! 	  if (frame_offset)
! 	    goto failure;
  
! 	  /* Taking the address of a local variable is fatal to tail
! 	     recursion if the address is used by the recursive call.  */
! 	  if (current_function_uses_addressof)
! 	    goto failure;
  
! 	  /* alloca (until we have stack slot life analysis) inhibits
! 	     sibling call optimizations, but not tail recursion.
! 	     Similarly if we use varargs or stdarg since they implicitly
! 	     may take the address of an argument.  */
!  	  if (current_function_calls_alloca
! 	      || current_function_varargs || current_function_stdarg)
! 	    sibcall = 0;
  
  	  call_block = BLOCK_FOR_INSN (insn);
  
--- 537,578 ----
  	{
  	  int sibcall = (XEXP (PATTERN (insn), 1) != NULL_RTX);
  	  int tailrecursion = (XEXP (PATTERN (insn), 2) != NULL_RTX);
+ 	  int force_tailcall = (XINT (PATTERN (insn), 4));
  	  basic_block succ_block, call_block;
  	  rtx temp, hardret, softret;
  
! 	  if (force_tailcall)
! 	    {
! 	      /* The front-end has marked this call to indicate
! 	         that it should be treated as a tail call.
! 	         This means that the programmer or front-end has promised
! 		 that it's OK for the compiler to deallocate this function's
! 		 parameters and local variables before making this call.
! 		 So we can safely skip the checks for taking the
! 		 address of local variables.  */
! 	    }
! 	  else 
! 	    {
! 	      /* We must be careful with stack slots which are live at
! 	         potential optimization sites.
  
! 	         ?!? This test is overly conservative and will be replaced.  */
! 	      if (frame_offset)
! 	        goto failure;
  
! 	      /* Taking the address of a local variable is fatal to tail
! 	         recursion if the address is used by the recursive call.  */
! 	      if (current_function_uses_addressof)
! 	        goto failure;
  
! 	      /* alloca (until we have stack slot life analysis) inhibits
! 	         sibling call optimizations, but not tail recursion.
! 	         Similarly if we use varargs or stdarg since they implicitly
! 	         may take the address of an argument.  */
!  	      if (current_function_calls_alloca
! 	          || current_function_varargs || current_function_stdarg)
! 	        sibcall = 0;
! 	    }
  
  	  call_block = BLOCK_FOR_INSN (insn);
  
*************** optimize_sibling_and_tail_recursive_call
*** 614,619 ****
--- 632,639 ----
  	     execute after returning from the function call.  So this call
  	     can not be optimized.  */
  failure:
+ 	  if (force_tailcall)
+ 	    warning("cannot optimize tail call");
  	  sibcall = 0, tailrecursion = 0;
  success:
  
Index: tree.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree.h,v
retrieving revision 1.215
diff -u -d -u -c -p -r1.215 tree.h
*** tree.h	2000/12/30 13:10:51	1.215
--- tree.h	2001/01/03 19:49:20
*************** struct tree_common
*** 243,248 ****
--- 243,250 ----
             TREE_PARMLIST (C++)
         SAVE_EXPR_NOPLACEHOLDER in
  	   SAVE_EXPR
+        CALL_EXPR_FORCE_TAILCALL in
+            CALL_EXPR, METHOD_CALL_EXPR
  
     asm_written_flag:
  
*************** struct tree_vec
*** 782,787 ****
--- 784,797 ----
  
  /* In a CONSTRUCTOR node.  */
  #define CONSTRUCTOR_ELTS(NODE) TREE_OPERAND (NODE, 1)
+ 
+ /* In a CALL_EXPR or METHOD_CALL_EXPR, nonzero means that this call
+    should be treated as a tail recursive call or sibling call,
+    even if there are still local variables that appear to be live.
+    The back-end will issue a warning if it can't generate a call
+    for which this flag is nonzero as a tail recursive call or
+    sibling call.  */
+ #define CALL_EXPR_FORCE_TAILCALL(NODE) ((NODE)->common.unsigned_flag)
  
  /* In ordinary expression nodes.  */
  #define TREE_OPERAND(NODE, I) (EXPR_CHECK (NODE)->exp.operands[I])


For reference, here's the change to my front-end:

Index: mercury-gcc.c
===================================================================
RCS file: /home/pgrad/fjh/fs/hg/fjh_repository/mercury/mercury-gcc.c,v
retrieving revision 1.18
diff -u -d -u -c -p -r1.18 mercury-gcc.c
cvs diff: conflicting specifications of output style
*** mercury-gcc.c	2001/01/03 17:14:16	1.18
--- mercury-gcc.c	2001/01/03 20:17:09
*************** tree
*** 414,420 ****
  merc_build_call_expr (fnptr, args, force_tailcall_p)
       tree fnptr;
       tree args;
!      int force_tailcall_p ATTRIBUTE_UNUSED;
  {
    tree fnptrtype;
    tree fntype;
--- 414,420 ----
  merc_build_call_expr (fnptr, args, force_tailcall_p)
       tree fnptr;
       tree args;
!      int force_tailcall_p;
  {
    tree fnptrtype;
    tree fntype;
*************** merc_build_call_expr (fnptr, args, force
*** 426,431 ****
--- 426,432 ----
    rettype = TREE_TYPE (fntype);
    call = build (CALL_EXPR, rettype, fnptr, args, NULL_TREE);
    TREE_SIDE_EFFECTS (call) = 1;
+   CALL_EXPR_FORCE_TAILCALL (call) = force_tailcall_p;
    return fold (call);
  }
  
-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]