This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[patch] Get rid of stack trampolines for nested functions


Hi,

this patch implements generic support for eliminating stack trampolines and,
consequently, the need to make the stack executable when pointers to nested
functions are used.  This is done on a per-language and per-target basis
(one language hook and one target hook parameterize it), and code generation
is unchanged unless both hooks are turned on.  The patch also implements a
-ftrampolines option to let the user override them.
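
A minimal sketch of how these switches combine, for illustration only.  The
struct and function names are hypothetical, not GCC's; the alignment helper
mirrors the DEFAULT_FUNCTION_ALIGNMENT macro from the defaults.h hunk of this
patch, and BITS_PER_UNIT is assumed to be 8.

```c
#include <assert.h>
#include <stdbool.h>

#define BITS_PER_UNIT 8 /* assumption: 8-bit bytes */

/* Hypothetical stand-ins for the language hook, the target hook and the
   -ftrampolines flag; these are not the real GCC declarations.  */
struct switches
{
  bool lang_hook;        /* language opts in to custom descriptors */
  int target_hook;       /* 0: ABI already uses descriptors, > 0: tag value */
  bool flag_trampolines; /* user asked for trampolines explicitly */
};

/* Descriptors replace trampolines only if both hooks are on and the
   user did not override them with -ftrampolines.  */
bool
use_descriptors (const struct switches *s)
{
  return s->lang_hook && s->target_hook > 0 && !s->flag_trampolines;
}

/* Minimum function alignment in bits, following the shape of
   DEFAULT_FUNCTION_ALIGNMENT: when descriptors are possible, code
   addresses must leave the tag bit clear.  */
int
default_function_alignment (const struct switches *s, int function_boundary)
{
  /* The documentation asks for a power of 2 when the hook is positive.  */
  assert (s->target_hook <= 0
          || (s->target_hook & (s->target_hook - 1)) == 0);

  if (s->lang_hook && s->target_hook > 0)
    {
      int needed = 2 * s->target_hook * BITS_PER_UNIT;
      return needed > function_boundary ? needed : function_boundary;
    }
  return function_boundary;
}
```

With a target hook value of 1 and a FUNCTION_BOUNDARY of 8 bits, the helper
yields 16 bits, which keeps bit 0 of every code address clear.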

The idea is based on the fact that, for targets using function descriptors as 
per their ABI like IA-64, AIX or VMS platforms, stack trampolines "degenerate" 
into descriptors built at run time on the stack and thus made up of data only, 
which in turn means that the stack doesn't need to be made executable.
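
As a rough illustration of a "data only" descriptor, here is a sketch in
plain C.  It is not the IA-64, AIX or VMS layout; the names are hypothetical,
and the static chain is passed as an explicit first argument rather than in
the dedicated register a real ABI would use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor: two data words, nothing executable.  */
struct fdesc
{
  void *chain;                         /* frame of the enclosing function */
  long (*entry) (void *chain, long x); /* code address of the nested one */
};

/* The part of the enclosing frame that the nested function reads.  */
struct outer_frame
{
  long base;
};

/* Stand-in for a nested function: it reaches its parent's variables
   through the static chain.  */
long
inner (void *chain, long x)
{
  const struct outer_frame *frame = chain;
  return frame->base + x;
}

/* A callee that only knows the descriptor, not the nested function.  */
long
apply_twice (const struct fdesc *f, long x)
{
  assert (f != NULL && f->entry != NULL);
  return f->entry (f->chain, f->entry (f->chain, x));
}

/* The enclosing function builds the descriptor on its own stack; the
   stack holds data only, so it need not be executable.  */
long
outer (long base, long x)
{
  struct outer_frame frame = { base };
  struct fdesc d = { &frame, inner };
  return apply_twice (&d, x);
}
```

Calling outer (10, 1) goes through the descriptor twice without any code
ever being written to the stack.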

This descriptor-based scheme is implemented generically for nested functions:
the nested function lowering pass builds generic descriptors instead of
trampolines on the stack when it encounters pointers to nested functions.
There are therefore two kinds of pointers to functions, and indirect calls
need a run-time identification mechanism to distinguish them.
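
The run-time identification can be pictured with the sketch below.  It is a
model in portable-ish C, not the RTL the patch emits: DESC_BIT plays the role
of the target hook value, the chain is an explicit argument, the helper names
are hypothetical, and the GNU aligned attribute is used to guarantee that the
plain function's address has the tag bit clear.

```c
#include <assert.h>
#include <stdint.h>

#define DESC_BIT 1 /* assumed tag, like a target hook value of 1 */

/* Chain first, code entry second, as in the builtins.c comment.  */
struct descriptor
{
  void *chain;
  int (*code) (void *chain, int x);
};

struct frame
{
  int offset;
};

/* An ordinary function; its address must have DESC_BIT clear.  */
__attribute__ ((aligned (16))) int
plain_double (int x)
{
  return 2 * x;
}

/* Stand-in for a nested function using its parent's frame.  */
int
nested_add (void *chain, int x)
{
  const struct frame *f = chain;
  return x + f->offset;
}

/* "Unalign" the descriptor address so that it can be told apart from
   a code address.  */
uintptr_t
adjust_descriptor (struct descriptor *d)
{
  assert (((uintptr_t) d & DESC_BIT) == 0);
  return (uintptr_t) d + DESC_BIT;
}

/* Indirect call: test the tag bit, then either call directly or load
   the chain and the code address from the descriptor.  */
int
call_indirect (uintptr_t p, int x)
{
  if ((p & DESC_BIT) == 0)
    return ((int (*) (int)) p) (x);

  struct descriptor *d = (struct descriptor *) (p - DESC_BIT);
  return d->code (d->chain, x);
}
```

A caller would pass (uintptr_t) plain_double for an ordinary function and
adjust_descriptor (&d) for a nested one; call_indirect handles both, at the
cost of one test and one usually-taken branch per indirect call.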

Because of that, enabling the support breaks binary compatibility for code
that manipulates pointers to nested functions.  That's OK for Ada, where
nested functions are first-class citizens and we really need this, but not
for C.  So, for example, Ada doesn't use it at the interface with C (when
objects have "convention C" in Ada parlance).

This was bootstrapped/regtested on x86_64-suse-linux, and AdaCore has been
using it on native platforms (Linux, Windows, Solaris, etc.) for years.

OK for the mainline?


2016-06-29  Eric Botcazou  <ebotcazou@adacore.com>

	PR ada/37139
	PR ada/67205
	* common.opt (-ftrampolines): New option.
	* doc/invoke.texi (Code Gen Options): Document it.
	* doc/tm.texi.in (Trampolines): Add TARGET_CUSTOM_FUNCTION_DESCRIPTORS.
	* doc/tm.texi: Regenerate.
	* builtins.def: Add init_descriptor and adjust_descriptor.
	* builtins.c (expand_builtin_init_trampoline): Do not issue a warning
	on platforms with descriptors.
	(expand_builtin_init_descriptor): New function.
	(expand_builtin_adjust_descriptor): Likewise.
	(expand_builtin) <BUILT_IN_INIT_DESCRIPTOR>: New case.
	<BUILT_IN_ADJUST_DESCRIPTOR>: Likewise.
	* calls.c (prepare_call_address): Remove SIBCALLP parameter and add
	FLAGS parameter.  Deal with indirect calls by descriptor and adjust.
	Set STATIC_CHAIN_REG_P on the static chain register, if any.
	(call_expr_flags): Set ECF_BY_DESCRIPTOR for calls by descriptor.
	(expand_call): Likewise.  Move around call to prepare_call_address
	and pass all flags to it.
	* cfgexpand.c (expand_call_stmt): Reinstate CALL_EXPR_BY_DESCRIPTOR.
	* gimple.h (enum gf_mask): New GF_CALL_BY_DESCRIPTOR value.
	(gimple_call_set_by_descriptor): New setter.
	(gimple_call_by_descriptor_p): New getter.
	* gimple.c (gimple_build_call_from_tree): Set CALL_EXPR_BY_DESCRIPTOR.
	(gimple_call_flags): Deal with GF_CALL_BY_DESCRIPTOR.
	* langhooks.h (struct lang_hooks): Add custom_function_descriptors.
	* langhooks-def.h (LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS): Define.
	(LANG_HOOKS_INITIALIZER): Add LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS.
	* rtl.h (STATIC_CHAIN_REG_P): New macro.
	* rtlanal.c (find_first_parameter_load): Skip static chain registers.
	* target.def (custom_function_descriptors): New POD hook.
	* tree.h (FUNC_ADDR_BY_DESCRIPTOR): New flag on ADDR_EXPR.
	(CALL_EXPR_BY_DESCRIPTOR): New flag on CALL_EXPR.
	* tree-core.h (ECF_BY_DESCRIPTOR): New mask.
	Document FUNC_ADDR_BY_DESCRIPTOR and CALL_EXPR_BY_DESCRIPTOR.
	* tree.c (make_node_stat) <tcc_declaration>: Set function alignment to
	DEFAULT_FUNCTION_ALIGNMENT instead of FUNCTION_BOUNDARY.
	(build_common_builtin_nodes): Initialize init_descriptor and
	adjust_descriptor.
	* tree-nested.c: Include target.h.
	(struct nesting_info): Add 'any_descr_created' field.
	(get_descriptor_type): New function.
	(lookup_element_for_decl): New function extracted from...
	(create_field_for_decl): Likewise.
	(lookup_tramp_for_decl): ...here.  Adjust.
	(lookup_descr_for_decl): New function.
	(convert_tramp_reference_op): Deal with descriptors.
	(build_init_call_stmt): New function extracted from...
	(finalize_nesting_tree_1): ...here.  Adjust and deal with descriptors.
	* defaults.h (DEFAULT_FUNCTION_ALIGNMENT): Define.
	(TRAMPOLINE_ALIGNMENT): Set to above instead of FUNCTION_BOUNDARY.
	* config/aarch64/aarch64.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Define.
	* config/alpha/alpha.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise.
	* config/arm/arm.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise.
	* config/arm/arm.c (arm_function_ok_for_sibcall): Return false for an
	indirect call by descriptor if all the argument registers are used.
	* config/i386/i386.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Define.
	* config/ia64/ia64.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise.
	* config/mips/mips.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise.
	* config/pa/pa.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise.
	* config/rs6000/rs6000.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise.
	* config/sparc/sparc.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise.
ada/
	* gcc-interface/misc.c (LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS): Define.
	* gcc-interface/trans.c (Attribute_to_gnu) <Attr_Access>: Deal with
	a zero TARGET_CUSTOM_FUNCTION_DESCRIPTORS specially for 'Code_Address.
	Otherwise, if TARGET_CUSTOM_FUNCTION_DESCRIPTORS is positive, set
	FUNC_ADDR_BY_DESCRIPTOR for 'Access/'Unrestricted_Access of nested
	subprograms if the type can use an internal representation.
	(Call_to_gnu): Likewise, but set CALL_EXPR_BY_DESCRIPTOR on indirect
	calls if the type can use an internal representation.


2016-06-29  Eric Botcazou  <ebotcazou@adacore.com>

	* gnat.dg/trampoline3.adb: New test.
	* gnat.dg/trampoline4.adb: Likewise.

-- 
Eric Botcazou
Index: ada/gcc-interface/misc.c
===================================================================
--- ada/gcc-interface/misc.c	(revision 237848)
+++ ada/gcc-interface/misc.c	(working copy)
@@ -1416,6 +1416,8 @@ get_lang_specific (tree node)
 #define LANG_HOOKS_EH_PERSONALITY	gnat_eh_personality
 #undef  LANG_HOOKS_DEEP_UNSHARING
 #define LANG_HOOKS_DEEP_UNSHARING	true
+#undef  LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS
+#define LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS true
 
 struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER;
 
Index: ada/gcc-interface/trans.c
===================================================================
--- ada/gcc-interface/trans.c	(revision 237850)
+++ ada/gcc-interface/trans.c	(working copy)
@@ -1702,6 +1702,17 @@ Attribute_to_gnu (Node_Id gnat_node, tre
 
 	  if (TREE_CODE (gnu_expr) == ADDR_EXPR)
 	    TREE_NO_TRAMPOLINE (gnu_expr) = TREE_CONSTANT (gnu_expr) = 1;
+
+	  /* On targets for which function symbols denote a descriptor, the
+	     code address is stored within the first slot of the descriptor
+	     so we do an additional dereference:
+	       result = *((result_type *) result)
+	     where we expect result to be of some pointer type already.  */
+	  if (targetm.calls.custom_function_descriptors == 0)
+	    gnu_result
+	      = build_unary_op (INDIRECT_REF, NULL_TREE,
+				convert (build_pointer_type (gnu_result_type),
+					 gnu_result));
 	}
 
       /* For 'Access, issue an error message if the prefix is a C++ method
@@ -1728,10 +1739,19 @@ Attribute_to_gnu (Node_Id gnat_node, tre
 	      /* Also check the inlining status.  */
 	      check_inlining_for_nested_subprog (TREE_OPERAND (gnu_expr, 0));
 
-	      /* Check that we're not violating the No_Implicit_Dynamic_Code
-		 restriction.  Be conservative if we don't know anything
-		 about the trampoline strategy for the target.  */
-	      Check_Implicit_Dynamic_Code_Allowed (gnat_node);
+	      /* Moreover, for 'Access or 'Unrestricted_Access with non-
+		 foreign-compatible representation, mark the ADDR_EXPR so
+		 that we can build a descriptor instead of a trampoline.  */
+	      if ((attribute == Attr_Access
+		   || attribute == Attr_Unrestricted_Access)
+		  && targetm.calls.custom_function_descriptors > 0
+		  && Can_Use_Internal_Rep (Etype (gnat_node)))
+		FUNC_ADDR_BY_DESCRIPTOR (gnu_expr) = 1;
+
+	      /* Otherwise, we need to check that we are not violating the
+		 No_Implicit_Dynamic_Code restriction.  */
+	      else if (targetm.calls.custom_function_descriptors != 0)
+	        Check_Implicit_Dynamic_Code_Allowed (gnat_node);
 	    }
 	}
       break;
@@ -4228,6 +4248,7 @@ Call_to_gnu (Node_Id gnat_node, tree *gn
   tree gnu_after_list = NULL_TREE;
   tree gnu_retval = NULL_TREE;
   tree gnu_call, gnu_result;
+  bool by_descriptor = false;
   bool went_into_elab_proc = false;
   bool pushed_binding_level = false;
   Entity_Id gnat_formal;
@@ -4267,7 +4288,15 @@ Call_to_gnu (Node_Id gnat_node, tree *gn
      type the access type is pointing to.  Otherwise, get the formals from the
      entity being called.  */
   if (Nkind (Name (gnat_node)) == N_Explicit_Dereference)
-    gnat_formal = First_Formal_With_Extras (Etype (Name (gnat_node)));
+    {
+      gnat_formal = First_Formal_With_Extras (Etype (Name (gnat_node)));
+
+      /* If the access type doesn't require foreign-compatible representation,
+	 be prepared for descriptors.  */
+      if (targetm.calls.custom_function_descriptors > 0
+	  && Can_Use_Internal_Rep (Etype (Prefix (Name (gnat_node)))))
+	by_descriptor = true;
+    }
   else if (Nkind (Name (gnat_node)) == N_Attribute_Reference)
     /* Assume here that this must be 'Elab_Body or 'Elab_Spec.  */
     gnat_formal = Empty;
@@ -4668,6 +4697,7 @@ Call_to_gnu (Node_Id gnat_node, tree *gn
 
   gnu_call
     = build_call_vec (gnu_result_type, gnu_subprog_addr, gnu_actual_vec);
+  CALL_EXPR_BY_DESCRIPTOR (gnu_call) = by_descriptor;
   set_expr_location_from_node (gnu_call, gnat_node);
 
   /* If we have created a temporary for the return value, initialize it.  */
Index: builtins.c
===================================================================
--- builtins.c	(revision 237789)
+++ builtins.c	(working copy)
@@ -4621,8 +4621,9 @@ expand_builtin_init_trampoline (tree exp
     {
       trampolines_created = 1;
 
-      warning_at (DECL_SOURCE_LOCATION (t_func), OPT_Wtrampolines,
-		  "trampoline generated for nested function %qD", t_func);
+      if (targetm.calls.custom_function_descriptors != 0)
+	warning_at (DECL_SOURCE_LOCATION (t_func), OPT_Wtrampolines,
+		    "trampoline generated for nested function %qD", t_func);
     }
 
   return const0_rtx;
@@ -4644,6 +4645,57 @@ expand_builtin_adjust_trampoline (tree e
   return tramp;
 }
 
+/* Expand a call to the builtin descriptor initialization routine.
+   A descriptor is made up of a couple of pointers to the static
+   chain and the code entry in this order.  */
+
+static rtx
+expand_builtin_init_descriptor (tree exp)
+{
+  tree t_descr, t_func, t_chain;
+  rtx m_descr, r_descr, r_func, r_chain;
+
+  if (!validate_arglist (exp, POINTER_TYPE, POINTER_TYPE, POINTER_TYPE,
+			 VOID_TYPE))
+    return NULL_RTX;
+
+  t_descr = CALL_EXPR_ARG (exp, 0);
+  t_func = CALL_EXPR_ARG (exp, 1);
+  t_chain = CALL_EXPR_ARG (exp, 2);
+
+  r_descr = expand_normal (t_descr);
+  m_descr = gen_rtx_MEM (BLKmode, r_descr);
+  MEM_NOTRAP_P (m_descr) = 1;
+
+  r_func = expand_normal (t_func);
+  r_chain = expand_normal (t_chain);
+
+  /* Generate insns to initialize the descriptor.  */
+  emit_move_insn (adjust_address_nv (m_descr, Pmode, 0), r_chain);
+  emit_move_insn (adjust_address_nv (m_descr, Pmode, UNITS_PER_WORD), r_func);
+
+  return const0_rtx;
+}
+
+/* Expand a call to the builtin descriptor adjustment routine.  */
+
+static rtx
+expand_builtin_adjust_descriptor (tree exp)
+{
+  rtx tramp;
+
+  if (!validate_arglist (exp, POINTER_TYPE, VOID_TYPE))
+    return NULL_RTX;
+
+  tramp = expand_normal (CALL_EXPR_ARG (exp, 0));
+
+  /* Unalign the descriptor to allow runtime identification.  */
+  tramp
+    = plus_constant (Pmode, tramp, targetm.calls.custom_function_descriptors);
+
+  return force_operand (tramp, NULL_RTX);
+}
+
 /* Expand the call EXP to the built-in signbit, signbitf or signbitl
    function.  The function first checks whether the back end provides
    an insn to implement signbit for the respective mode.  If not, it
@@ -6221,6 +6273,11 @@ expand_builtin (tree exp, rtx target, rt
     case BUILT_IN_ADJUST_TRAMPOLINE:
       return expand_builtin_adjust_trampoline (exp);
 
+    case BUILT_IN_INIT_DESCRIPTOR:
+      return expand_builtin_init_descriptor (exp);
+    case BUILT_IN_ADJUST_DESCRIPTOR:
+      return expand_builtin_adjust_descriptor (exp);
+
     case BUILT_IN_FORK:
     case BUILT_IN_EXECL:
     case BUILT_IN_EXECV:
Index: builtins.def
===================================================================
--- builtins.def	(revision 237789)
+++ builtins.def	(working copy)
@@ -856,6 +856,8 @@ DEF_C99_BUILTIN        (BUILT_IN__EXIT2,
 DEF_BUILTIN_STUB (BUILT_IN_INIT_TRAMPOLINE, "__builtin_init_trampoline")
 DEF_BUILTIN_STUB (BUILT_IN_INIT_HEAP_TRAMPOLINE, "__builtin_init_heap_trampoline")
 DEF_BUILTIN_STUB (BUILT_IN_ADJUST_TRAMPOLINE, "__builtin_adjust_trampoline")
+DEF_BUILTIN_STUB (BUILT_IN_INIT_DESCRIPTOR, "__builtin_init_descriptor")
+DEF_BUILTIN_STUB (BUILT_IN_ADJUST_DESCRIPTOR, "__builtin_adjust_descriptor")
 DEF_BUILTIN_STUB (BUILT_IN_NONLOCAL_GOTO, "__builtin_nonlocal_goto")
 
 /* Implementing __builtin_setjmp.  */
Index: calls.c
===================================================================
--- calls.c	(revision 237789)
+++ calls.c	(working copy)
@@ -183,18 +183,73 @@ static void restore_fixed_argument_area
 
 rtx
 prepare_call_address (tree fndecl_or_type, rtx funexp, rtx static_chain_value,
-		      rtx *call_fusage, int reg_parm_seen, int sibcallp)
+		      rtx *call_fusage, int reg_parm_seen, int flags)
 {
   /* Make a valid memory address and copy constants through pseudo-regs,
      but not for a constant address if -fno-function-cse.  */
   if (GET_CODE (funexp) != SYMBOL_REF)
-    /* If we are using registers for parameters, force the
-       function address into a register now.  */
-    funexp = ((reg_parm_seen
-	       && targetm.small_register_classes_for_mode_p (FUNCTION_MODE))
-	      ? force_not_mem (memory_address (FUNCTION_MODE, funexp))
-	      : memory_address (FUNCTION_MODE, funexp));
-  else if (! sibcallp)
+    {
+      /* If it's an indirect call by descriptor, generate code to perform
+	 runtime identification of the pointer and load the descriptor.  */
+      if ((flags & ECF_BY_DESCRIPTOR) && !flag_trampolines)
+	{
+	  const int bit_val = targetm.calls.custom_function_descriptors;
+	  rtx call_lab = gen_label_rtx ();
+
+	  gcc_assert (fndecl_or_type && TYPE_P (fndecl_or_type));
+	  fndecl_or_type
+	    = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL, NULL_TREE,
+			  fndecl_or_type);
+	  DECL_STATIC_CHAIN (fndecl_or_type) = 1;
+	  rtx chain = targetm.calls.static_chain (fndecl_or_type, false);
+
+	  /* Avoid long live ranges around function calls.  */
+	  funexp = copy_to_mode_reg (Pmode, funexp);
+
+	  if (REG_P (chain))
+	    emit_insn (gen_rtx_CLOBBER (VOIDmode, chain));
+
+	  /* Emit the runtime identification pattern.  */
+	  rtx mask = gen_rtx_AND (Pmode, funexp, GEN_INT (bit_val));
+	  emit_cmp_and_jump_insns (mask, const0_rtx, EQ, NULL_RTX, Pmode, 1,
+				   call_lab);
+
+	  /* Statically predict the branch to be very likely taken.  */
+	  rtx_insn *insn = get_last_insn ();
+	  if (JUMP_P (insn))
+	    predict_insn_def (insn, PRED_BUILTIN_EXPECT, TAKEN);
+
+	  /* Load the descriptor.  */
+	  rtx mem = gen_rtx_MEM (Pmode,
+				 plus_constant (Pmode, funexp, - bit_val));
+	  MEM_NOTRAP_P (mem) = 1;
+	  emit_move_insn (chain, mem);
+	  mem = gen_rtx_MEM (Pmode,
+			     plus_constant (Pmode, funexp,
+					    UNITS_PER_WORD - bit_val));
+	  MEM_NOTRAP_P (mem) = 1;
+	  emit_move_insn (funexp, mem);
+
+	  emit_label (call_lab);
+
+	  if (REG_P (chain))
+	    {
+	      use_reg (call_fusage, chain);
+	      STATIC_CHAIN_REG_P (chain) = 1;
+	    }
+
+	  /* Make sure we're not going to be overwritten below.  */
+	  gcc_assert (!static_chain_value);
+	}
+
+      /* If we are using registers for parameters, force the
+	 function address into a register now.  */
+      funexp = ((reg_parm_seen
+		 && targetm.small_register_classes_for_mode_p (FUNCTION_MODE))
+		 ? force_not_mem (memory_address (FUNCTION_MODE, funexp))
+		 : memory_address (FUNCTION_MODE, funexp));
+    }
+  else if (!(flags & ECF_SIBCALL))
     {
       if (!NO_FUNCTION_CSE && optimize && ! flag_no_function_cse)
 	funexp = force_reg (Pmode, funexp);
@@ -211,7 +266,10 @@ prepare_call_address (tree fndecl_or_typ
 
       emit_move_insn (chain, static_chain_value);
       if (REG_P (chain))
-	use_reg (call_fusage, chain);
+	{
+	  use_reg (call_fusage, chain);
+	  STATIC_CHAIN_REG_P (chain) = 1;
+	}
     }
 
   return funexp;
@@ -792,11 +850,13 @@ call_expr_flags (const_tree t)
     flags = internal_fn_flags (CALL_EXPR_IFN (t));
   else
     {
-      t = TREE_TYPE (CALL_EXPR_FN (t));
-      if (t && TREE_CODE (t) == POINTER_TYPE)
-	flags = flags_from_decl_or_type (TREE_TYPE (t));
+      tree type = TREE_TYPE (CALL_EXPR_FN (t));
+      if (type && TREE_CODE (type) == POINTER_TYPE)
+	flags = flags_from_decl_or_type (TREE_TYPE (type));
       else
 	flags = 0;
+      if (CALL_EXPR_BY_DESCRIPTOR (t))
+	flags |= ECF_BY_DESCRIPTOR;
     }
 
   return flags;
@@ -2633,6 +2693,8 @@ expand_call (tree exp, rtx target, int i
     {
       fntype = TREE_TYPE (TREE_TYPE (addr));
       flags |= flags_from_decl_or_type (fntype);
+      if (CALL_EXPR_BY_DESCRIPTOR (exp))
+	flags |= ECF_BY_DESCRIPTOR;
     }
   rettype = TREE_TYPE (exp);
 
@@ -3344,6 +3406,13 @@ expand_call (tree exp, rtx target, int i
       if (STRICT_ALIGNMENT)
 	store_unaligned_arguments_into_pseudos (args, num_actuals);
 
+      /* Prepare the address of the call.  This must be done before any
+	 register parameter is loaded for find_first_parameter_load to
+	 work properly in the presence of descriptors.  */
+      funexp = prepare_call_address (fndecl ? fndecl : fntype, funexp,
+				     static_chain_value, &call_fusage,
+				     reg_parm_seen, flags);
+
       /* Now store any partially-in-registers parm.
 	 This is the last place a block-move can happen.  */
       if (reg_parm_seen)
@@ -3454,10 +3523,6 @@ expand_call (tree exp, rtx target, int i
 	}
 
       after_args = get_last_insn ();
-      funexp = prepare_call_address (fndecl ? fndecl : fntype, funexp,
-				     static_chain_value, &call_fusage,
-				     reg_parm_seen, pass == 0);
-
       load_register_parameters (args, num_actuals, &call_fusage, flags,
 				pass == 0, &sibcall_failure);
 
Index: cfgexpand.c
===================================================================
--- cfgexpand.c	(revision 237789)
+++ cfgexpand.c	(working copy)
@@ -2636,6 +2636,7 @@ expand_call_stmt (gcall *stmt)
   else
     CALL_FROM_THUNK_P (exp) = gimple_call_from_thunk_p (stmt);
   CALL_EXPR_VA_ARG_PACK (exp) = gimple_call_va_arg_pack_p (stmt);
+  CALL_EXPR_BY_DESCRIPTOR (exp) = gimple_call_by_descriptor_p (stmt);
   SET_EXPR_LOCATION (exp, gimple_location (stmt));
   CALL_WITH_BOUNDS_P (exp) = gimple_call_with_bounds_p (stmt);
 
Index: common.opt
===================================================================
--- common.opt	(revision 237789)
+++ common.opt	(working copy)
@@ -2303,6 +2303,10 @@ ftracer
 Common Report Var(flag_tracer) Optimization
 Perform superblock formation via tail duplication.
 
+ftrampolines
+Common Report Var(flag_trampolines) Init(0)
+Always generate trampolines for pointers to nested functions.
+
 ; Zero means that floating-point math operations cannot generate a
 ; (user-visible) trap.  This is the case, for example, in nonstop
 ; IEEE 754 arithmetic.
Index: config/aarch64/aarch64.h
===================================================================
--- config/aarch64/aarch64.h	(revision 237789)
+++ config/aarch64/aarch64.h	(working copy)
@@ -779,6 +779,9 @@ typedef struct
    correctly.  */
 #define TRAMPOLINE_SECTION text_section
 
+/* Use custom descriptors instead of trampolines when possible.  */
+#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1
+
 /* To start with.  */
 #define BRANCH_COST(SPEED_P, PREDICTABLE_P) \
   (aarch64_branch_cost (SPEED_P, PREDICTABLE_P))
Index: config/alpha/alpha.h
===================================================================
--- config/alpha/alpha.h	(revision 237789)
+++ config/alpha/alpha.h	(working copy)
@@ -996,3 +996,6 @@ extern long alpha_auto_offset;
 #define NO_IMPLICIT_EXTERN_C
 
 #define TARGET_SUPPORTS_WIDE_INT 1
+
+/* Use custom descriptors instead of trampolines when possible if not VMS.  */
+#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS (TARGET_ABI_OPEN_VMS ? 0 : 1)
Index: config/arm/arm.c
===================================================================
--- config/arm/arm.c	(revision 237789)
+++ config/arm/arm.c	(working copy)
@@ -6781,6 +6781,29 @@ arm_function_ok_for_sibcall (tree decl,
       && DECL_WEAK (decl))
     return false;
 
+  /* We cannot do a tailcall for an indirect call by descriptor if all the
+     argument registers are used because the only register left to load the
+     address is IP and it will already contain the static chain.  */
+  if (!decl && CALL_EXPR_BY_DESCRIPTOR (exp) && !flag_trampolines)
+    {
+      tree fntype = TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (exp)));
+      CUMULATIVE_ARGS cum;
+      cumulative_args_t cum_v;
+
+      arm_init_cumulative_args (&cum, fntype, NULL_RTX, NULL_TREE);
+      cum_v = pack_cumulative_args (&cum);
+
+      for (tree t = TYPE_ARG_TYPES (fntype); t; t = TREE_CHAIN (t))
+	{
+	  tree type = TREE_VALUE (t);
+	  if (!VOID_TYPE_P (type))
+	    arm_function_arg_advance (cum_v, TYPE_MODE (type), type, true);
+	}
+
+      if (!arm_function_arg (cum_v, SImode, integer_type_node, true))
+	return false;
+    }
+
   /* Everything else is ok.  */
   return true;
 }
Index: config/arm/arm.h
===================================================================
--- config/arm/arm.h	(revision 237789)
+++ config/arm/arm.h	(working copy)
@@ -1632,6 +1632,10 @@ typedef struct
 
 /* Alignment required for a trampoline in bits.  */
 #define TRAMPOLINE_ALIGNMENT  32
+
+/* Use custom descriptors instead of trampolines when possible, but
+   we cannot use bit #0 because it is the ARM/Thumb selection bit.  */
+#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 2
 
 /* Addressing modes, and classification of registers for them.  */
 #define HAVE_POST_INCREMENT   1
Index: config/i386/i386.h
===================================================================
--- config/i386/i386.h	(revision 237789)
+++ config/i386/i386.h	(working copy)
@@ -2660,6 +2660,9 @@ extern void debug_dispatch_window (int);
 
 #define TARGET_SUPPORTS_WIDE_INT 1
 
+/* Use custom descriptors instead of trampolines when possible.  */
+#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1
+
 /*
 Local variables:
 version-control: t
Index: config/ia64/ia64.h
===================================================================
--- config/ia64/ia64.h	(revision 237789)
+++ config/ia64/ia64.h	(working copy)
@@ -1714,4 +1714,7 @@ struct GTY(()) machine_function
 /* Switch on code for querying unit reservations.  */
 #define CPU_UNITS_QUERY 1
 
+/* IA-64 already uses descriptors for its standard calling sequence.  */
+#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 0
+
 /* End of ia64.h */
Index: config/mips/mips.h
===================================================================
--- config/mips/mips.h	(revision 237789)
+++ config/mips/mips.h	(working copy)
@@ -3413,3 +3413,6 @@ struct GTY(())  machine_function {
 #define ENABLE_LD_ST_PAIRS \
   (TARGET_LOAD_STORE_PAIRS && (TUNE_P5600 || TUNE_I6400) \
    && !TARGET_MICROMIPS && !TARGET_FIX_24K)
+
+/* Use custom descriptors instead of trampolines when possible.  */
+#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1
Index: config/pa/pa.h
===================================================================
--- config/pa/pa.h	(revision 237789)
+++ config/pa/pa.h	(working copy)
@@ -1313,3 +1313,6 @@ do {									     \
    seven and four instructions, respectively.  */  
 #define MAX_PCREL17F_OFFSET \
   (flag_pic ? (TARGET_HPUX ? 198164 : 221312) : 240000)
+
+/* HP-PA already uses descriptors for its standard calling sequence.  */
+#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 0
Index: config/rs6000/rs6000.h
===================================================================
--- config/rs6000/rs6000.h	(revision 237789)
+++ config/rs6000/rs6000.h	(working copy)
@@ -2894,3 +2894,6 @@ extern GTY(()) tree rs6000_builtin_types
 extern GTY(()) tree rs6000_builtin_decls[RS6000_BUILTIN_COUNT];
 
 #define TARGET_SUPPORTS_WIDE_INT 1
+
+/* Use custom descriptors instead of trampolines when possible if not AIX.  */
+#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS (DEFAULT_ABI == ABI_AIX ? 0 : 1)
Index: config/sparc/sparc.h
===================================================================
--- config/sparc/sparc.h	(revision 237789)
+++ config/sparc/sparc.h	(working copy)
@@ -1817,3 +1817,6 @@ extern int sparc_indent_opcode;
 #define SPARC_LOW_FE_EXCEPT_VALUES 0
 
 #define TARGET_SUPPORTS_WIDE_INT 1
+
+/* Use custom descriptors instead of trampolines when possible.  */
+#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1
Index: defaults.h
===================================================================
--- defaults.h	(revision 237789)
+++ defaults.h	(working copy)
@@ -1080,9 +1080,18 @@ see the files COPYING3 and COPYING.RUNTI
 #define CASE_VECTOR_PC_RELATIVE 0
 #endif
 
+/* Force minimum alignment to be able to use the least significant bits
+   for distinguishing descriptor addresses from code addresses.  */
+#define DEFAULT_FUNCTION_ALIGNMENT					\
+  (lang_hooks.custom_function_descriptors				\
+   && targetm.calls.custom_function_descriptors > 0			\
+   ? MAX (FUNCTION_BOUNDARY,						\
+	  2 * targetm.calls.custom_function_descriptors * BITS_PER_UNIT)\
+   : FUNCTION_BOUNDARY)
+
 /* Assume that trampolines need function alignment.  */
 #ifndef TRAMPOLINE_ALIGNMENT
-#define TRAMPOLINE_ALIGNMENT FUNCTION_BOUNDARY
+#define TRAMPOLINE_ALIGNMENT DEFAULT_FUNCTION_ALIGNMENT
 #endif
 
 /* Register mappings for target machines without register windows.  */
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi	(revision 237789)
+++ doc/invoke.texi	(working copy)
@@ -498,7 +498,7 @@ Objective-C and Objective-C++ Dialects}.
 -fverbose-asm  -fpack-struct[=@var{n}]  @gol
 -fleading-underscore  -ftls-model=@var{model} @gol
 -fstack-reuse=@var{reuse_level} @gol
--ftrapv  -fwrapv @gol
+-ftrampolines  -ftrapv  -fwrapv @gol
 -fvisibility=@r{[}default@r{|}internal@r{|}hidden@r{|}protected@r{]} @gol
 -fstrict-volatile-bitfields -fsync-libcalls}
 
@@ -11546,6 +11546,31 @@ unit, or if @option{-fpic} is not given
 The default without @option{-fpic} is @samp{initial-exec}; with
 @option{-fpic} the default is @samp{global-dynamic}.
 
+@item -ftrampolines
+@opindex ftrampolines
+Always generate trampolines for pointers to nested functions.
+
+A trampoline is a small piece of data or code that is created at run
+time on the stack when the address of a nested function is taken, and
+is used to call the nested function indirectly.  For some targets, it
+is made up of data only and thus requires no special treatment.  But,
+for most targets, it is made up of code and thus requires the stack
+to be made executable in order for the program to work properly.
+
+@option{-fno-trampolines} is enabled by default to let the compiler avoid
+generating them if it computes that this is safe, on a case by case basis,
+and replace them with descriptors.  Descriptors are always made up of data
+only, but the generated code must be prepared to deal with them.
+
+This option has no effect for languages other than Ada as of this
+writing.  Moreover, code compiled with @option{-ftrampolines} and code
+compiled with @option{-fno-trampolines} are not binary compatible if
+nested functions are present.  This option must therefore be used on
+a program-wide basis and be manipulated with extreme care.
+
+This option has no effect on targets whose trampolines are made up of
+data only, for example IA-64 targets, AIX or VMS platforms.
+
 @item -fvisibility=@r{[}default@r{|}internal@r{|}hidden@r{|}protected@r{]}
 @opindex fvisibility
 Set the default ELF image symbol visibility to the specified option---all
Index: doc/tm.texi
===================================================================
--- doc/tm.texi	(revision 237789)
+++ doc/tm.texi	(working copy)
@@ -5181,6 +5181,25 @@ be returned; otherwise @var{addr} should
 If this hook is not defined, @var{addr} will be used for function calls.
 @end deftypefn
 
+@deftypevr {Target Hook} int TARGET_CUSTOM_FUNCTION_DESCRIPTORS
+This hook should be defined to a power of 2 if the target will benefit
+from the use of custom descriptors for nested functions instead of the
+standard trampolines.  Such descriptors are created at run time on the
+stack and made up of data only, but they are non-standard so the generated
+code must be prepared to deal with them.  This hook should be defined to 0
+if the target uses function descriptors for its standard calling sequence,
+like for example HP-PA or IA-64.  Using descriptors for nested functions
+eliminates the need for trampolines that reside on the stack and require
+it to be made executable.
+
+The value of the macro is used to parameterize the run-time identification
+scheme implemented to distinguish descriptors from function addresses: it
+gives the number of bytes by which their address is shifted in comparison
+with function addresses.  The value of 1 will generally work, unless it is
+already used by the target for a similar purpose, like for example on ARM
+where it is used to distinguish Thumb functions from ARM ones.
+@end deftypevr
+
 Implementing trampolines is difficult on many machines because they have
 separate instruction and data caches.  Writing into a stack location
 fails to clear the memory in the instruction cache, so when the program
Index: doc/tm.texi.in
===================================================================
--- doc/tm.texi.in	(revision 237789)
+++ doc/tm.texi.in	(working copy)
@@ -3947,6 +3947,8 @@ is used for aligning trampolines.
 
 @hook TARGET_TRAMPOLINE_ADJUST_ADDRESS
 
+@hook TARGET_CUSTOM_FUNCTION_DESCRIPTORS
+
 Implementing trampolines is difficult on many machines because they have
 separate instruction and data caches.  Writing into a stack location
 fails to clear the memory in the instruction cache, so when the program
Index: gimple.c
===================================================================
--- gimple.c	(revision 237789)
+++ gimple.c	(working copy)
@@ -373,6 +373,7 @@ gimple_build_call_from_tree (tree t)
     gimple_call_set_from_thunk (call, CALL_FROM_THUNK_P (t));
   gimple_call_set_va_arg_pack (call, CALL_EXPR_VA_ARG_PACK (t));
   gimple_call_set_nothrow (call, TREE_NOTHROW (t));
+  gimple_call_set_by_descriptor (call, CALL_EXPR_BY_DESCRIPTOR (t));
   gimple_set_no_warning (call, TREE_NO_WARNING (t));
   gimple_call_set_with_bounds (call, CALL_WITH_BOUNDS_P (t));
 
@@ -1386,6 +1387,9 @@ gimple_call_flags (const gimple *stmt)
   if (stmt->subcode & GF_CALL_NOTHROW)
     flags |= ECF_NOTHROW;
 
+  if (stmt->subcode & GF_CALL_BY_DESCRIPTOR)
+    flags |= ECF_BY_DESCRIPTOR;
+
   return flags;
 }
 
Index: gimple.h
===================================================================
--- gimple.h	(revision 237789)
+++ gimple.h	(working copy)
@@ -146,6 +146,7 @@ enum gf_mask {
     GF_CALL_CTRL_ALTERING       = 1 << 7,
     GF_CALL_WITH_BOUNDS 	= 1 << 8,
     GF_CALL_MUST_TAIL_CALL	= 1 << 9,
+    GF_CALL_BY_DESCRIPTOR	= 1 << 10,
     GF_OMP_PARALLEL_COMBINED	= 1 << 0,
     GF_OMP_PARALLEL_GRID_PHONY = 1 << 1,
     GF_OMP_TASK_TASKLOOP	= 1 << 0,
@@ -3357,6 +3358,26 @@ gimple_call_alloca_for_var_p (gcall *s)
   return (s->subcode & GF_CALL_ALLOCA_FOR_VAR) != 0;
 }
 
+/* If BY_DESCRIPTOR_P is true, GIMPLE_CALL S is an indirect call for which
+   pointers to nested functions are descriptors instead of trampolines.  */
+
+static inline void
+gimple_call_set_by_descriptor (gcall *s, bool by_descriptor_p)
+{
+  if (by_descriptor_p)
+    s->subcode |= GF_CALL_BY_DESCRIPTOR;
+  else
+    s->subcode &= ~GF_CALL_BY_DESCRIPTOR;
+}
+
+/* Return true if S is a by-descriptor call.  */
+
+static inline bool
+gimple_call_by_descriptor_p (gcall *s)
+{
+  return (s->subcode & GF_CALL_BY_DESCRIPTOR) != 0;
+}
+
 /* Copy all the GF_CALL_* flags from ORIG_CALL to DEST_CALL.  */
 
 static inline void
Index: langhooks-def.h
===================================================================
--- langhooks-def.h	(revision 237789)
+++ langhooks-def.h	(working copy)
@@ -120,6 +120,7 @@ extern bool lhd_omp_mappable_type (tree)
 #define LANG_HOOKS_BLOCK_MAY_FALLTHRU	hook_bool_const_tree_true
 #define LANG_HOOKS_EH_USE_CXA_END_CLEANUP	false
 #define LANG_HOOKS_DEEP_UNSHARING	false
+#define LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS	false
 
 /* Attribute hooks.  */
 #define LANG_HOOKS_ATTRIBUTE_TABLE		NULL
@@ -319,7 +320,8 @@ extern void lhd_end_section (void);
   LANG_HOOKS_EH_PROTECT_CLEANUP_ACTIONS, \
   LANG_HOOKS_BLOCK_MAY_FALLTHRU, \
   LANG_HOOKS_EH_USE_CXA_END_CLEANUP, \
-  LANG_HOOKS_DEEP_UNSHARING \
+  LANG_HOOKS_DEEP_UNSHARING, \
+  LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS \
 }
 
 #endif /* GCC_LANG_HOOKS_DEF_H */
Index: langhooks.h
===================================================================
--- langhooks.h	(revision 237789)
+++ langhooks.h	(working copy)
@@ -505,6 +505,10 @@ struct lang_hooks
      gimplification.  */
   bool deep_unsharing;
 
+  /* True if this language may use custom descriptors for nested functions
+     instead of trampolines.  */
+  bool custom_function_descriptors;
+
   /* Whenever you add entries here, make sure you adjust langhooks-def.h
      and langhooks.c accordingly.  */
 };
Index: rtl.h
===================================================================
--- rtl.h	(revision 237789)
+++ rtl.h	(working copy)
@@ -317,6 +317,7 @@ struct GTY((desc("0"), tag("0"),
      1 in a CONCAT is VAL_EXPR_IS_COPIED in var-tracking.c.
      1 in a VALUE is SP_BASED_VALUE_P in cselib.c.
      1 in a SUBREG generated by LRA for reload insns.
+     1 in a REG if this is a static chain register.
      1 in a CALL for calls instrumented by Pointer Bounds Checker.  */
   unsigned int jump : 1;
   /* In a CODE_LABEL, part of the two-bit alternate entry field.
@@ -2264,6 +2265,10 @@ do {								        \
  : (SIGN) == SRP_SIGNED ? SUBREG_PROMOTED_SIGNED_P (RTX)		\
  : SUBREG_PROMOTED_UNSIGNED_P (RTX))
 
+/* True if the REG is the static chain register for some CALL_INSN.  */
+#define STATIC_CHAIN_REG_P(RTX)	\
+  (RTL_FLAG_CHECK1 ("STATIC_CHAIN_REG_P", (RTX), REG)->jump)
+
 /* True if the subreg was generated by LRA for reload insns.  Such
    subregs are valid only during LRA.  */
 #define LRA_SUBREG_P(RTX)	\
Index: rtlanal.c
===================================================================
--- rtlanal.c	(revision 237789)
+++ rtlanal.c	(working copy)
@@ -3914,7 +3914,8 @@ find_first_parameter_load (rtx_insn *cal
   parm.nregs = 0;
   for (p = CALL_INSN_FUNCTION_USAGE (call_insn); p; p = XEXP (p, 1))
     if (GET_CODE (XEXP (p, 0)) == USE
-	&& REG_P (XEXP (XEXP (p, 0), 0)))
+	&& REG_P (XEXP (XEXP (p, 0), 0))
+	&& !STATIC_CHAIN_REG_P (XEXP (XEXP (p, 0), 0)))
       {
 	gcc_assert (REGNO (XEXP (XEXP (p, 0), 0)) < FIRST_PSEUDO_REGISTER);
 
Index: target.def
===================================================================
--- target.def	(revision 237789)
+++ target.def	(working copy)
@@ -4723,6 +4723,26 @@ be returned; otherwise @var{addr} should
 If this hook is not defined, @var{addr} will be used for function calls.",
  rtx, (rtx addr), NULL)
 
+DEFHOOKPOD
+(custom_function_descriptors,
+ "This hook should be defined to a power of 2 if the target will benefit\n\
+from the use of custom descriptors for nested functions instead of the\n\
+standard trampolines.  Such descriptors are created at run time on the\n\
+stack and made up of data only, but they are non-standard so the generated\n\
+code must be prepared to deal with them.  This hook should be defined to 0\n\
+if the target uses function descriptors for its standard calling sequence,\n\
+like for example HP-PA or IA-64.  Using descriptors for nested functions\n\
+eliminates the need for trampolines that reside on the stack and require\n\
+it to be made executable.\n\
+\n\
+The value of the macro is used to parameterize the run-time identification\n\
+scheme implemented to distinguish descriptors from function addresses: it\n\
+gives the number of bytes by which their address is shifted in comparison\n\
+with function addresses.  The value of 1 will generally work, unless it is\n\
+already used by the target for a similar purpose, like for example on ARM\n\
+where it is used to distinguish Thumb functions from ARM ones.",
+ int, -1)
+
 /* Return the number of bytes of its own arguments that a function
    pops on returning, or 0 if the function pops no arguments and the
    caller must therefore pop them all after the function returns.  */
Index: testsuite/gnat.dg/trampoline3.adb
===================================================================
--- testsuite/gnat.dg/trampoline3.adb	(revision 0)
+++ testsuite/gnat.dg/trampoline3.adb	(working copy)
@@ -0,0 +1,22 @@
+-- { dg-do compile { target *-*-linux* } }
+-- { dg-options "-gnatws" }
+
+procedure Trampoline3 is
+
+  A : Integer;
+
+  type FuncPtr is access function (I : Integer) return Integer;
+
+  function F (I : Integer) return Integer is
+  begin
+    return A + I;
+  end F;
+
+  P : FuncPtr := F'Access;
+  I : Integer;
+
+begin
+  I := P(0);
+end;
+
+-- { dg-final { scan-assembler-not "GNU-stack.*x" } }
Index: testsuite/gnat.dg/trampoline4.adb
===================================================================
--- testsuite/gnat.dg/trampoline4.adb	(revision 0)
+++ testsuite/gnat.dg/trampoline4.adb	(working copy)
@@ -0,0 +1,22 @@
+-- { dg-do compile { target *-*-linux* } }
+-- { dg-options "-ftrampolines -gnatws" }
+
+procedure Trampoline4 is
+
+  A : Integer;
+
+  type FuncPtr is access function (I : Integer) return Integer;
+
+  function F (I : Integer) return Integer is
+  begin
+    return A + I;
+  end F;
+
+  P : FuncPtr := F'Access;
+  I : Integer;
+
+begin
+  I := P(0);
+end;
+
+-- { dg-final { scan-assembler "GNU-stack.*x" } }
Index: tree-core.h
===================================================================
--- tree-core.h	(revision 237789)
+++ tree-core.h	(working copy)
@@ -90,6 +90,9 @@ struct die_struct;
 /* Nonzero if this call is into the transaction runtime library.  */
 #define ECF_TM_BUILTIN		  (1 << 13)
 
+/* Nonzero if this is an indirect call by descriptor.  */
+#define ECF_BY_DESCRIPTOR	  (1 << 14)
+
 /* Call argument flags.  */
 /* Nonzero if the argument is not dereferenced recursively, thus only
    directly reachable memory is read or written.  */
@@ -1216,6 +1219,12 @@ struct GTY(()) tree_base {
 
        REF_REVERSE_STORAGE_ORDER in
            BIT_FIELD_REF, MEM_REF
+
+       FUNC_ADDR_BY_DESCRIPTOR in
+           ADDR_EXPR
+
+       CALL_EXPR_BY_DESCRIPTOR in
+           CALL_EXPR
 */
 
 struct GTY(()) tree_typed {
Index: tree-nested.c
===================================================================
--- tree-nested.c	(revision 237789)
+++ tree-nested.c	(working copy)
@@ -21,6 +21,7 @@
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
+#include "target.h"
 #include "rtl.h"
 #include "tree.h"
 #include "gimple.h"
@@ -103,6 +104,7 @@ struct nesting_info
 
   bool any_parm_remapped;
   bool any_tramp_created;
+  bool any_descr_created;
   char static_chain_added;
 };
 
@@ -486,12 +488,40 @@ get_trampoline_type (struct nesting_info
   return trampoline_type;
 }
 
-/* Given DECL, a nested function, find or create a field in the non-local
-   frame structure for a trampoline for this function.  */
+/* Build or return the type used to represent a nested function descriptor.  */
+
+static GTY(()) tree descriptor_type;
 
 static tree
-lookup_tramp_for_decl (struct nesting_info *info, tree decl,
-		       enum insert_option insert)
+get_descriptor_type (struct nesting_info *info)
+{
+  tree t;
+
+  if (descriptor_type)
+    return descriptor_type;
+
+  t = build_index_type (build_int_cst (NULL_TREE, 2 * UNITS_PER_WORD - 1));
+  t = build_array_type (char_type_node, t);
+  t = build_decl (DECL_SOURCE_LOCATION (info->context),
+		  FIELD_DECL, get_identifier ("__data"), t);
+  SET_DECL_ALIGN (t, BITS_PER_WORD);
+  DECL_USER_ALIGN (t) = 1;
+
+  descriptor_type = make_node (RECORD_TYPE);
+  TYPE_NAME (descriptor_type) = get_identifier ("__builtin_descriptor");
+  TYPE_FIELDS (descriptor_type) = t;
+  layout_type (descriptor_type);
+  DECL_CONTEXT (t) = descriptor_type;
+
+  return descriptor_type;
+}
+
+/* Given DECL, a nested function, find or create an element in the
+   var map for this function.  */
+
+static tree
+lookup_element_for_decl (struct nesting_info *info, tree decl,
+			 enum insert_option insert)
 {
   if (insert == NO_INSERT)
     {
@@ -501,19 +531,73 @@ lookup_tramp_for_decl (struct nesting_in
 
   tree *slot = &info->var_map->get_or_insert (decl);
   if (!*slot)
-    {
-      tree field = make_node (FIELD_DECL);
-      DECL_NAME (field) = DECL_NAME (decl);
-      TREE_TYPE (field) = get_trampoline_type (info);
-      TREE_ADDRESSABLE (field) = 1;
+    *slot = build_tree_list (NULL_TREE, NULL_TREE);
 
-      insert_field_into_struct (get_frame_type (info), field);
-      *slot = field;
+  return (tree) *slot;
+}
+
+/* Given DECL, a nested function, create a field in the non-local
+   frame structure for this function.  */
+
+static tree
+create_field_for_decl (struct nesting_info *info, tree decl, tree type)
+{
+  tree field = make_node (FIELD_DECL);
+  DECL_NAME (field) = DECL_NAME (decl);
+  TREE_TYPE (field) = type;
+  TREE_ADDRESSABLE (field) = 1;
+  insert_field_into_struct (get_frame_type (info), field);
+  return field;
+}
+
+/* Given DECL, a nested function, find or create a field in the non-local
+   frame structure for a trampoline for this function.  */
+
+static tree
+lookup_tramp_for_decl (struct nesting_info *info, tree decl,
+		       enum insert_option insert)
+{
+  tree elt, field;
+
+  elt = lookup_element_for_decl (info, decl, insert);
+  if (!elt)
+    return NULL_TREE;
+
+  field = TREE_PURPOSE (elt);
 
+  if (!field && insert == INSERT)
+    {
+      field = create_field_for_decl (info, decl, get_trampoline_type (info));
+      TREE_PURPOSE (elt) = field;
       info->any_tramp_created = true;
     }
 
-  return *slot;
+  return field;
+}
+
+/* Given DECL, a nested function, find or create a field in the non-local
+   frame structure for a descriptor for this function.  */
+
+static tree
+lookup_descr_for_decl (struct nesting_info *info, tree decl,
+		       enum insert_option insert)
+{
+  tree elt, field;
+
+  elt = lookup_element_for_decl (info, decl, insert);
+  if (!elt)
+    return NULL_TREE;
+
+  field = TREE_VALUE (elt);
+
+  if (!field && insert == INSERT)
+    {
+      field = create_field_for_decl (info, decl, get_descriptor_type (info));
+      TREE_VALUE (elt) = field;
+      info->any_descr_created = true;
+    }
+
+  return field;
 }
 
 /* Build or return the field within the non-local frame state that holds
@@ -2303,6 +2387,7 @@ convert_tramp_reference_op (tree *tp, in
   struct walk_stmt_info *wi = (struct walk_stmt_info *) data;
   struct nesting_info *const info = (struct nesting_info *) wi->info, *i;
   tree t = *tp, decl, target_context, x, builtin;
+  bool descr;
   gcall *call;
 
   *walk_subtrees = 0;
@@ -2337,7 +2422,14 @@ convert_tramp_reference_op (tree *tp, in
 	 we need to insert the trampoline.  */
       for (i = info; i->context != target_context; i = i->outer)
 	continue;
-      x = lookup_tramp_for_decl (i, decl, INSERT);
+
+      /* Decide whether to generate a descriptor or a trampoline.  */
+      descr = FUNC_ADDR_BY_DESCRIPTOR (t) && !flag_trampolines;
+
+      if (descr)
+	x = lookup_descr_for_decl (i, decl, INSERT);
+      else
+	x = lookup_tramp_for_decl (i, decl, INSERT);
 
       /* Compute the address of the field holding the trampoline.  */
       x = get_frame_field (info, target_context, x, &wi->gsi);
@@ -2346,7 +2438,10 @@ convert_tramp_reference_op (tree *tp, in
 
       /* Do machine-specific ugliness.  Normally this will involve
 	 computing extra alignment, but it can really be anything.  */
-      builtin = builtin_decl_implicit (BUILT_IN_ADJUST_TRAMPOLINE);
+      if (descr)
+	builtin = builtin_decl_implicit (BUILT_IN_ADJUST_DESCRIPTOR);
+      else
+	builtin = builtin_decl_implicit (BUILT_IN_ADJUST_TRAMPOLINE);
       call = gimple_build_call (builtin, 1, x);
       x = init_tmp_var_with_call (info, &wi->gsi, call);
 
@@ -2820,6 +2915,27 @@ fold_mem_refs (tree *const &e, void *dat
   return true;
 }
 
+/* Given DECL, a nested function, build an initialization call for FIELD,
+   the trampoline or descriptor for DECL, using FUNC as the function.  */
+
+static gcall *
+build_init_call_stmt (struct nesting_info *info, tree decl, tree field,
+		      tree func)
+{
+  tree arg1, arg2, arg3, x;
+
+  gcc_assert (DECL_STATIC_CHAIN (decl));
+  arg3 = build_addr (info->frame_decl);
+
+  arg2 = build_addr (decl);
+
+  x = build3 (COMPONENT_REF, TREE_TYPE (field),
+	      info->frame_decl, field, NULL_TREE);
+  arg1 = build_addr (x);
+
+  return gimple_build_call (func, 3, arg1, arg2, arg3);
+}
+
 /* Do "everything else" to clean up or complete state collected by the various
    walking passes -- create a field to hold the frame base address, lay out the
    types and decls, generate code to initialize the frame decl, store critical
@@ -2965,23 +3081,32 @@ finalize_nesting_tree_1 (struct nesting_
       struct nesting_info *i;
       for (i = root->inner; i ; i = i->next)
 	{
-	  tree arg1, arg2, arg3, x, field;
+	  tree field, x;
 
 	  field = lookup_tramp_for_decl (root, i->context, NO_INSERT);
 	  if (!field)
 	    continue;
 
-	  gcc_assert (DECL_STATIC_CHAIN (i->context));
-	  arg3 = build_addr (root->frame_decl);
+	  x = builtin_decl_implicit (BUILT_IN_INIT_TRAMPOLINE);
+	  stmt = build_init_call_stmt (root, i->context, field, x);
+	  gimple_seq_add_stmt (&stmt_list, stmt);
+	}
+    }
 
-	  arg2 = build_addr (i->context);
+  /* If descriptors were created, then we need to initialize them.  */
+  if (root->any_descr_created)
+    {
+      struct nesting_info *i;
+      for (i = root->inner; i ; i = i->next)
+	{
+	  tree field, x;
 
-	  x = build3 (COMPONENT_REF, TREE_TYPE (field),
-		      root->frame_decl, field, NULL_TREE);
-	  arg1 = build_addr (x);
+	  field = lookup_descr_for_decl (root, i->context, NO_INSERT);
+	  if (!field)
+	    continue;
 
-	  x = builtin_decl_implicit (BUILT_IN_INIT_TRAMPOLINE);
-	  stmt = gimple_build_call (x, 3, arg1, arg2, arg3);
+	  x = builtin_decl_implicit (BUILT_IN_INIT_DESCRIPTOR);
+	  stmt = build_init_call_stmt (root, i->context, field, x);
 	  gimple_seq_add_stmt (&stmt_list, stmt);
 	}
     }
Index: tree.c
===================================================================
--- tree.c	(revision 237789)
+++ tree.c	(working copy)
@@ -1019,7 +1019,7 @@ make_node_stat (enum tree_code code MEM_
 	{
 	  if (code == FUNCTION_DECL)
 	    {
-	      SET_DECL_ALIGN (t, FUNCTION_BOUNDARY);
+	      SET_DECL_ALIGN (t, DEFAULT_FUNCTION_ALIGNMENT);
 	      DECL_MODE (t) = FUNCTION_MODE;
 	    }
 	  else
@@ -10567,12 +10567,19 @@ build_common_builtin_nodes (void)
 			BUILT_IN_INIT_HEAP_TRAMPOLINE,
 			"__builtin_init_heap_trampoline",
 			ECF_NOTHROW | ECF_LEAF);
+  local_define_builtin ("__builtin_init_descriptor", ftype,
+			BUILT_IN_INIT_DESCRIPTOR,
+			"__builtin_init_descriptor", ECF_NOTHROW | ECF_LEAF);
 
   ftype = build_function_type_list (ptr_type_node, ptr_type_node, NULL_TREE);
   local_define_builtin ("__builtin_adjust_trampoline", ftype,
 			BUILT_IN_ADJUST_TRAMPOLINE,
 			"__builtin_adjust_trampoline",
 			ECF_CONST | ECF_NOTHROW);
+  local_define_builtin ("__builtin_adjust_descriptor", ftype,
+			BUILT_IN_ADJUST_DESCRIPTOR,
+			"__builtin_adjust_descriptor",
+			ECF_CONST | ECF_NOTHROW);
 
   ftype = build_function_type_list (void_type_node,
 				    ptr_type_node, ptr_type_node, NULL_TREE);
Index: tree.h
===================================================================
--- tree.h	(revision 237789)
+++ tree.h	(working copy)
@@ -967,6 +967,16 @@ extern void omp_clause_range_check_faile
 #define REF_REVERSE_STORAGE_ORDER(NODE) \
   (TREE_CHECK2 (NODE, BIT_FIELD_REF, MEM_REF)->base.default_def_flag)
 
+/* In an ADDR_EXPR, indicates that this is a pointer to a nested function
+   represented by a descriptor instead of a trampoline.  */
+#define FUNC_ADDR_BY_DESCRIPTOR(NODE) \
+  (TREE_CHECK (NODE, ADDR_EXPR)->base.default_def_flag)
+
+/* In a CALL_EXPR, indicates that this is an indirect call for which
+   pointers to nested functions are descriptors instead of trampolines.  */
+#define CALL_EXPR_BY_DESCRIPTOR(NODE) \
+  (TREE_CHECK (NODE, CALL_EXPR)->base.default_def_flag)
+
 /* These flags are available for each language front end to use internally.  */
 #define TREE_LANG_FLAG_0(NODE) \
   (TREE_NOT_CHECK2 (NODE, TREE_VEC, SSA_NAME)->base.u.bits.lang_flag_0)
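
For readers who want a feel for the run-time identification scheme that the
target hook parameterizes, here is a minimal stand-alone sketch in plain C.
It is not code from the patch and not what the compiler emits.  The names
DESCRIPTOR_TAG, struct descriptor, adjust_descriptor and indirect_call are
hypothetical, the two-word layout (static chain, then code address) is an
assumption, and the static chain is passed as an explicit argument where real
code would use the static chain register.  The aligned attribute (a GNU
extension) stands in for the guarantee that function addresses leave the tag
bit clear.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the target hook value: the number of bytes by
   which a descriptor address is shifted relative to a function address.  */
#define DESCRIPTOR_TAG ((uintptr_t) 1)

/* Every callee takes the static chain explicitly in this sketch; generated
   code would pass it in the static chain register instead.  */
typedef int (*callee_t) (void *chain, int arg);

/* Assumed layout: two words of data on the stack, nothing executable.  */
struct descriptor
{
  void *static_chain;
  callee_t code;
};

/* Frame of the enclosing function, as seen by the nested function.  */
struct frame
{
  int a;
};

/* An ordinary function: it ignores the static chain.  */
__attribute__ ((aligned (16))) static int
plain_function (void *chain, int i)
{
  (void) chain;
  return i + 100;
}

/* A "nested" function: it reads a variable of its parent through the
   static chain.  */
__attribute__ ((aligned (16))) static int
nested_function (void *chain, int i)
{
  struct frame *f = (struct frame *) chain;
  return f->a + i;
}

/* Turn the address of a descriptor into a tagged pointer value.  */
static uintptr_t
adjust_descriptor (struct descriptor *d)
{
  return (uintptr_t) d + DESCRIPTOR_TAG;
}

/* Indirect call: test the tag to tell a descriptor from a code address.  */
static int
indirect_call (uintptr_t p, int arg)
{
  if (p & DESCRIPTOR_TAG)
    {
      struct descriptor *d = (struct descriptor *) (p - DESCRIPTOR_TAG);
      return d->code (d->static_chain, arg);
    }
  return ((callee_t) p) (NULL, arg);
}

/* Call once through a descriptor and once through a plain address.  */
int
run_example (void)
{
  struct frame f = { 42 };
  struct descriptor d = { &f, nested_function };
  int via_descriptor = indirect_call (adjust_descriptor (&d), 1);
  int via_address = indirect_call ((uintptr_t) plain_function, 1);
  return via_descriptor + via_address;
}
```

The extra test on every by-descriptor indirect call is the price of keeping
the stack free of executable code, and it is also why code built with the
scheme enabled is not binary compatible with code that expects trampolines.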
