This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [patch] Get rid of stack trampolines for nested functions


On 29/06/16 23:08, Eric Botcazou wrote:
> Index: config/aarch64/aarch64.h
> ===================================================================
> --- config/aarch64/aarch64.h	(revision 237789)
> +++ config/aarch64/aarch64.h	(working copy)
> @@ -779,6 +779,9 @@ typedef struct
>     correctly.  */
>  #define TRAMPOLINE_SECTION text_section
>  
> +/* Use custom descriptors instead of trampolines when possible.  */
> +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1
> +


Eric,

If I understand how this is supposed to work, then it is not
future-proof against changes to the architecture.  The bottom two bits
in both AArch32 (arm) and AArch64 are reserved for future use by the
architecture; they must not be used by software for tricks like this.
As has already been seen in AArch32 state, bit 0 is used to indicate
ARM/Thumb ISA selection.

The patch to arm.h is similarly problematic in this regard.

R.

> Hi,
> 
> this patch implements generic support for the elimination of stack trampolines 
> and, consequently, of the need to make the stack executable when pointers to 
> nested functions are used.  That's done on a per-language and per-target basis 
> (i.e. there is one language hook and one target hook to parameterize it) and 
> there are no changes whatsoever in code generation unless both are turned on 
> (the patch also implements a -ftrampolines option to let the user override them).
> 
> The idea is based on the fact that, for targets using function descriptors as 
> per their ABI like IA-64, AIX or VMS platforms, stack trampolines "degenerate" 
> into descriptors built at run time on the stack and thus made up of data only, 
> which in turn means that the stack doesn't need to be made executable.
> 
> This descriptor-based scheme is implemented generically for nested functions: 
> the nested function lowering pass builds generic descriptors instead of 
> trampolines on the stack when encountering pointers to nested functions.  This 
> means that there are two kinds of pointers to functions, so a run-time 
> identification mechanism is needed for indirect calls to distinguish them.
> 
> Because of that, enabling the support breaks binary compatibility (for code 
> manipulating pointers to nested functions).  That's OK for Ada, where nested 
> functions are first-class citizens anyway, so we really need this, but not 
> for C; so, for example, Ada doesn't use it at the interface with C (when 
> objects have "convention C" in Ada parlance).
> 
> This was bootstrapped/regtested on x86_64-suse-linux, but AdaCore has been 
> using it on native platforms (Linux, Windows, Solaris, etc.) for years.
> 
> OK for the mainline?
> 
> 
> 2016-06-29  Eric Botcazou  <ebotcazou@adacore.com>
> 
> 	PR ada/37139
> 	PR ada/67205
> 	* common.opt (-ftrampolines): New option.
> 	* doc/invoke.texi (Code Gen Options): Document it.
> 	* doc/tm.texi.in (Trampolines): Add TARGET_CUSTOM_FUNCTION_DESCRIPTORS.
> 	* doc/tm.texi: Regenerate.
> 	* builtins.def: Add init_descriptor and adjust_descriptor.
> 	* builtins.c (expand_builtin_init_trampoline): Do not issue a warning
> 	on platforms with descriptors.
> 	(expand_builtin_init_descriptor): New function.
> 	(expand_builtin_adjust_descriptor): Likewise.
> 	(expand_builtin) <BUILT_IN_INIT_DESCRIPTOR>: New case.
> 	<BUILT_IN_ADJUST_DESCRIPTOR>: Likewise.
> 	* calls.c (prepare_call_address): Remove SIBCALLP parameter and add
> 	FLAGS parameter.  Deal with indirect calls by descriptor and adjust.
> 	Set STATIC_CHAIN_REG_P on the static chain register, if any.
> 	(call_expr_flags): Set ECF_BY_DESCRIPTOR for calls by descriptor.
> 	(expand_call): Likewise.  Move around call to prepare_call_address
> 	and pass all flags to it.
> 	* cfgexpand.c (expand_call_stmt): Reinstate CALL_EXPR_BY_DESCRIPTOR.
> 	* gimple.h (enum gf_mask): New GF_CALL_BY_DESCRIPTOR value.
> 	(gimple_call_set_by_descriptor): New setter.
> 	(gimple_call_by_descriptor_p): New getter.
> 	* gimple.c (gimple_build_call_from_tree): Set CALL_EXPR_BY_DESCRIPTOR.
> 	(gimple_call_flags): Deal with GF_CALL_BY_DESCRIPTOR.
> 	* langhooks.h (struct lang_hooks): Add custom_function_descriptors.
> 	* langhooks-def.h (LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS): Define.
> 	(LANG_HOOKS_INITIALIZER): Add LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS.
> 	* rtl.h (STATIC_CHAIN_REG_P): New macro.
> 	* rtlanal.c (find_first_parameter_load): Skip static chain registers.
> 	* target.def (custom_function_descriptors): New POD hook.
> 	* tree.h (FUNC_ADDR_BY_DESCRIPTOR): New flag on ADDR_EXPR.
> 	(CALL_EXPR_BY_DESCRIPTOR): New flag on CALL_EXPR.
> 	* tree-core.h (ECF_BY_DESCRIPTOR): New mask.
> 	Document FUNC_ADDR_BY_DESCRIPTOR and CALL_EXPR_BY_DESCRIPTOR.
> 	* tree.c (make_node_stat) <tcc_declaration>: Set function alignment to
> 	DEFAULT_FUNCTION_ALIGNMENT instead of FUNCTION_BOUNDARY.
> 	(build_common_builtin_nodes): Initialize init_descriptor and
> 	adjust_descriptor.
> 	* tree-nested.c: Include target.h.
> 	(struct nesting_info): Add 'any_descr_created' field.
> 	(get_descriptor_type): New function.
> 	(lookup_element_for_decl): New function extracted from...
> 	(create_field_for_decl): Likewise.
> 	(lookup_tramp_for_decl): ...here.  Adjust.
> 	(lookup_descr_for_decl): New function.
> 	(convert_tramp_reference_op): Deal with descriptors.
> 	(build_init_call_stmt): New function extracted from...
> 	(finalize_nesting_tree_1): ...here.  Adjust and deal with descriptors.
> 	* defaults.h (DEFAULT_FUNCTION_ALIGNMENT): Define.
> 	(TRAMPOLINE_ALIGNMENT): Set to above instead of FUNCTION_BOUNDARY.
> 	* config/aarch64/aarch64.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Define.
> 	* config/alpha/alpha.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise.
> 	* config/arm/arm.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise.
> 	* config/arm/arm.c (arm_function_ok_for_sibcall): Return false for an
> 	indirect call by descriptor if all the argument registers are used.
> 	* config/i386/i386.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Define.
> 	* config/ia64/ia64.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise.
> 	* config/mips/mips.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise.
> 	* config/pa/pa.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise.
> 	* config/rs6000/rs6000.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise.
> 	* config/sparc/sparc.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise.
> ada/
> 	* gcc-interface/misc.c (LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS): Define.
> 	* gcc-interface/trans.c (Attribute_to_gnu) <Attr_Access>: Deal with
> 	a zero TARGET_CUSTOM_FUNCTION_DESCRIPTORS specially for 'Code_Address.
> 	Otherwise, if TARGET_CUSTOM_FUNCTION_DESCRIPTORS is positive, set
> 	FUNC_ADDR_BY_DESCRIPTOR for 'Access/'Unrestricted_Access of nested
> 	subprograms if the type can use an internal representation.
> 	(call_to_gnu): Likewise, but set CALL_EXPR_BY_DESCRIPTOR on indirect
> 	calls if the type can use an internal representation.
> 
> 
> 2016-06-29  Eric Botcazou  <ebotcazou@adacore.com>
> 
> 	* gnat.dg/trampoline3.adb: New test.
> 	* gnat.dg/trampoline4.adb: Likewise.
> 
> 
> p.diff
> 
> 
> Index: ada/gcc-interface/misc.c
> ===================================================================
> --- ada/gcc-interface/misc.c	(revision 237848)
> +++ ada/gcc-interface/misc.c	(working copy)
> @@ -1416,6 +1416,8 @@ get_lang_specific (tree node)
>  #define LANG_HOOKS_EH_PERSONALITY	gnat_eh_personality
>  #undef  LANG_HOOKS_DEEP_UNSHARING
>  #define LANG_HOOKS_DEEP_UNSHARING	true
> +#undef  LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS
> +#define LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS true
>  
>  struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER;
>  
> Index: ada/gcc-interface/trans.c
> ===================================================================
> --- ada/gcc-interface/trans.c	(revision 237850)
> +++ ada/gcc-interface/trans.c	(working copy)
> @@ -1702,6 +1702,17 @@ Attribute_to_gnu (Node_Id gnat_node, tre
>  
>  	  if (TREE_CODE (gnu_expr) == ADDR_EXPR)
>  	    TREE_NO_TRAMPOLINE (gnu_expr) = TREE_CONSTANT (gnu_expr) = 1;
> +
> +	  /* On targets for which function symbols denote a descriptor, the
> +	     code address is stored within the first slot of the descriptor
> +	     so we do an additional dereference:
> +	       result = *((result_type *) result)
> +	     where we expect result to be of some pointer type already.  */
> +	  if (targetm.calls.custom_function_descriptors == 0)
> +	    gnu_result
> +	      = build_unary_op (INDIRECT_REF, NULL_TREE,
> +				convert (build_pointer_type (gnu_result_type),
> +					 gnu_result));
>  	}
>  
>        /* For 'Access, issue an error message if the prefix is a C++ method
> @@ -1728,10 +1739,19 @@ Attribute_to_gnu (Node_Id gnat_node, tre
>  	      /* Also check the inlining status.  */
>  	      check_inlining_for_nested_subprog (TREE_OPERAND (gnu_expr, 0));
>  
> -	      /* Check that we're not violating the No_Implicit_Dynamic_Code
> -		 restriction.  Be conservative if we don't know anything
> -		 about the trampoline strategy for the target.  */
> -	      Check_Implicit_Dynamic_Code_Allowed (gnat_node);
> +	      /* Moreover, for 'Access or 'Unrestricted_Access with non-
> +		 foreign-compatible representation, mark the ADDR_EXPR so
> +		 that we can build a descriptor instead of a trampoline.  */
> +	      if ((attribute == Attr_Access
> +		   || attribute == Attr_Unrestricted_Access)
> +		  && targetm.calls.custom_function_descriptors > 0
> +		  && Can_Use_Internal_Rep (Etype (gnat_node)))
> +		FUNC_ADDR_BY_DESCRIPTOR (gnu_expr) = 1;
> +
> +	      /* Otherwise, we need to check that we are not violating the
> +		 No_Implicit_Dynamic_Code restriction.  */
> +	      else if (targetm.calls.custom_function_descriptors != 0)
> +	        Check_Implicit_Dynamic_Code_Allowed (gnat_node);
>  	    }
>  	}
>        break;
> @@ -4228,6 +4248,7 @@ Call_to_gnu (Node_Id gnat_node, tree *gn
>    tree gnu_after_list = NULL_TREE;
>    tree gnu_retval = NULL_TREE;
>    tree gnu_call, gnu_result;
> +  bool by_descriptor = false;
>    bool went_into_elab_proc = false;
>    bool pushed_binding_level = false;
>    Entity_Id gnat_formal;
> @@ -4267,7 +4288,15 @@ Call_to_gnu (Node_Id gnat_node, tree *gn
>       type the access type is pointing to.  Otherwise, get the formals from the
>       entity being called.  */
>    if (Nkind (Name (gnat_node)) == N_Explicit_Dereference)
> -    gnat_formal = First_Formal_With_Extras (Etype (Name (gnat_node)));
> +    {
> +      gnat_formal = First_Formal_With_Extras (Etype (Name (gnat_node)));
> +
> +      /* If the access type doesn't require foreign-compatible representation,
> +	 be prepared for descriptors.  */
> +      if (targetm.calls.custom_function_descriptors > 0
> +	  && Can_Use_Internal_Rep (Etype (Prefix (Name (gnat_node)))))
> +	by_descriptor = true;
> +    }
>    else if (Nkind (Name (gnat_node)) == N_Attribute_Reference)
>      /* Assume here that this must be 'Elab_Body or 'Elab_Spec.  */
>      gnat_formal = Empty;
> @@ -4668,6 +4697,7 @@ Call_to_gnu (Node_Id gnat_node, tree *gn
>  
>    gnu_call
>      = build_call_vec (gnu_result_type, gnu_subprog_addr, gnu_actual_vec);
> +  CALL_EXPR_BY_DESCRIPTOR (gnu_call) = by_descriptor;
>    set_expr_location_from_node (gnu_call, gnat_node);
>  
>    /* If we have created a temporary for the return value, initialize it.  */
> Index: builtins.c
> ===================================================================
> --- builtins.c	(revision 237789)
> +++ builtins.c	(working copy)
> @@ -4621,8 +4621,9 @@ expand_builtin_init_trampoline (tree exp
>      {
>        trampolines_created = 1;
>  
> -      warning_at (DECL_SOURCE_LOCATION (t_func), OPT_Wtrampolines,
> -		  "trampoline generated for nested function %qD", t_func);
> +      if (targetm.calls.custom_function_descriptors != 0)
> +	warning_at (DECL_SOURCE_LOCATION (t_func), OPT_Wtrampolines,
> +		    "trampoline generated for nested function %qD", t_func);
>      }
>  
>    return const0_rtx;
> @@ -4644,6 +4645,57 @@ expand_builtin_adjust_trampoline (tree e
>    return tramp;
>  }
>  
> +/* Expand a call to the builtin descriptor initialization routine.
> +   A descriptor is made up of a couple of pointers to the static
> +   chain and the code entry in this order.  */
> +
> +static rtx
> +expand_builtin_init_descriptor (tree exp)
> +{
> +  tree t_descr, t_func, t_chain;
> +  rtx m_descr, r_descr, r_func, r_chain;
> +
> +  if (!validate_arglist (exp, POINTER_TYPE, POINTER_TYPE, POINTER_TYPE,
> +			 VOID_TYPE))
> +    return NULL_RTX;
> +
> +  t_descr = CALL_EXPR_ARG (exp, 0);
> +  t_func = CALL_EXPR_ARG (exp, 1);
> +  t_chain = CALL_EXPR_ARG (exp, 2);
> +
> +  r_descr = expand_normal (t_descr);
> +  m_descr = gen_rtx_MEM (BLKmode, r_descr);
> +  MEM_NOTRAP_P (m_descr) = 1;
> +
> +  r_func = expand_normal (t_func);
> +  r_chain = expand_normal (t_chain);
> +
> +  /* Generate insns to initialize the descriptor.  */
> +  emit_move_insn (adjust_address_nv (m_descr, Pmode, 0), r_chain);
> +  emit_move_insn (adjust_address_nv (m_descr, Pmode, UNITS_PER_WORD), r_func);
> +
> +  return const0_rtx;
> +}
> +
> +/* Expand a call to the builtin descriptor adjustment routine.  */
> +
> +static rtx
> +expand_builtin_adjust_descriptor (tree exp)
> +{
> +  rtx tramp;
> +
> +  if (!validate_arglist (exp, POINTER_TYPE, VOID_TYPE))
> +    return NULL_RTX;
> +
> +  tramp = expand_normal (CALL_EXPR_ARG (exp, 0));
> +
> +  /* Unalign the descriptor to allow runtime identification.  */
> +  tramp
> +    = plus_constant (Pmode, tramp, targetm.calls.custom_function_descriptors);
> +
> +  return force_operand (tramp, NULL_RTX);
> +}
> +
>  /* Expand the call EXP to the built-in signbit, signbitf or signbitl
>     function.  The function first checks whether the back end provides
>     an insn to implement signbit for the respective mode.  If not, it
> @@ -6221,6 +6273,11 @@ expand_builtin (tree exp, rtx target, rt
>      case BUILT_IN_ADJUST_TRAMPOLINE:
>        return expand_builtin_adjust_trampoline (exp);
>  
> +    case BUILT_IN_INIT_DESCRIPTOR:
> +      return expand_builtin_init_descriptor (exp);
> +    case BUILT_IN_ADJUST_DESCRIPTOR:
> +      return expand_builtin_adjust_descriptor (exp);
> +
>      case BUILT_IN_FORK:
>      case BUILT_IN_EXECL:
>      case BUILT_IN_EXECV:
> Index: builtins.def
> ===================================================================
> --- builtins.def	(revision 237789)
> +++ builtins.def	(working copy)
> @@ -856,6 +856,8 @@ DEF_C99_BUILTIN        (BUILT_IN__EXIT2,
>  DEF_BUILTIN_STUB (BUILT_IN_INIT_TRAMPOLINE, "__builtin_init_trampoline")
>  DEF_BUILTIN_STUB (BUILT_IN_INIT_HEAP_TRAMPOLINE, "__builtin_init_heap_trampoline")
>  DEF_BUILTIN_STUB (BUILT_IN_ADJUST_TRAMPOLINE, "__builtin_adjust_trampoline")
> +DEF_BUILTIN_STUB (BUILT_IN_INIT_DESCRIPTOR, "__builtin_init_descriptor")
> +DEF_BUILTIN_STUB (BUILT_IN_ADJUST_DESCRIPTOR, "__builtin_adjust_descriptor")
>  DEF_BUILTIN_STUB (BUILT_IN_NONLOCAL_GOTO, "__builtin_nonlocal_goto")
>  
>  /* Implementing __builtin_setjmp.  */
> Index: calls.c
> ===================================================================
> --- calls.c	(revision 237789)
> +++ calls.c	(working copy)
> @@ -183,18 +183,73 @@ static void restore_fixed_argument_area
>  
>  rtx
>  prepare_call_address (tree fndecl_or_type, rtx funexp, rtx static_chain_value,
> -		      rtx *call_fusage, int reg_parm_seen, int sibcallp)
> +		      rtx *call_fusage, int reg_parm_seen, int flags)
>  {
>    /* Make a valid memory address and copy constants through pseudo-regs,
>       but not for a constant address if -fno-function-cse.  */
>    if (GET_CODE (funexp) != SYMBOL_REF)
> -    /* If we are using registers for parameters, force the
> -       function address into a register now.  */
> -    funexp = ((reg_parm_seen
> -	       && targetm.small_register_classes_for_mode_p (FUNCTION_MODE))
> -	      ? force_not_mem (memory_address (FUNCTION_MODE, funexp))
> -	      : memory_address (FUNCTION_MODE, funexp));
> -  else if (! sibcallp)
> +    {
> +      /* If it's an indirect call by descriptor, generate code to perform
> +	 runtime identification of the pointer and load the descriptor.  */
> +      if ((flags & ECF_BY_DESCRIPTOR) && !flag_trampolines)
> +	{
> +	  const int bit_val = targetm.calls.custom_function_descriptors;
> +	  rtx call_lab = gen_label_rtx ();
> +
> +	  gcc_assert (fndecl_or_type && TYPE_P (fndecl_or_type));
> +	  fndecl_or_type
> +	    = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL, NULL_TREE,
> +			  fndecl_or_type);
> +	  DECL_STATIC_CHAIN (fndecl_or_type) = 1;
> +	  rtx chain = targetm.calls.static_chain (fndecl_or_type, false);
> +
> +	  /* Avoid long live ranges around function calls.  */
> +	  funexp = copy_to_mode_reg (Pmode, funexp);
> +
> +	  if (REG_P (chain))
> +	    emit_insn (gen_rtx_CLOBBER (VOIDmode, chain));
> +
> +	  /* Emit the runtime identification pattern.  */
> +	  rtx mask = gen_rtx_AND (Pmode, funexp, GEN_INT (bit_val));
> +	  emit_cmp_and_jump_insns (mask, const0_rtx, EQ, NULL_RTX, Pmode, 1,
> +				   call_lab);
> +
> +	  /* Statically predict the branch to very likely taken.  */
> +	  rtx_insn *insn = get_last_insn ();
> +	  if (JUMP_P (insn))
> +	    predict_insn_def (insn, PRED_BUILTIN_EXPECT, TAKEN);
> +
> +	  /* Load the descriptor.  */
> +	  rtx mem = gen_rtx_MEM (Pmode,
> +				 plus_constant (Pmode, funexp, - bit_val));
> +	  MEM_NOTRAP_P (mem) = 1;
> +	  emit_move_insn (chain, mem);
> +	  mem = gen_rtx_MEM (Pmode,
> +			     plus_constant (Pmode, funexp,
> +					    UNITS_PER_WORD - bit_val));
> +	  MEM_NOTRAP_P (mem) = 1;
> +	  emit_move_insn (funexp, mem);
> +
> +	  emit_label (call_lab);
> +
> +	  if (REG_P (chain))
> +	    {
> +	      use_reg (call_fusage, chain);
> +	      STATIC_CHAIN_REG_P (chain) = 1;
> +	    }
> +
> +	  /* Make sure we're not going to be overwritten below.  */
> +	  gcc_assert (!static_chain_value);
> +	}
> +
> +      /* If we are using registers for parameters, force the
> +	 function address into a register now.  */
> +      funexp = ((reg_parm_seen
> +		 && targetm.small_register_classes_for_mode_p (FUNCTION_MODE))
> +		 ? force_not_mem (memory_address (FUNCTION_MODE, funexp))
> +		 : memory_address (FUNCTION_MODE, funexp));
> +    }
> +  else if (!(flags & ECF_SIBCALL))
>      {
>        if (!NO_FUNCTION_CSE && optimize && ! flag_no_function_cse)
>  	funexp = force_reg (Pmode, funexp);
> @@ -211,7 +266,10 @@ prepare_call_address (tree fndecl_or_typ
>  
>        emit_move_insn (chain, static_chain_value);
>        if (REG_P (chain))
> -	use_reg (call_fusage, chain);
> +	{
> +	  use_reg (call_fusage, chain);
> +	  STATIC_CHAIN_REG_P (chain) = 1;
> +	}
>      }
>  
>    return funexp;
> @@ -792,11 +850,13 @@ call_expr_flags (const_tree t)
>      flags = internal_fn_flags (CALL_EXPR_IFN (t));
>    else
>      {
> -      t = TREE_TYPE (CALL_EXPR_FN (t));
> -      if (t && TREE_CODE (t) == POINTER_TYPE)
> -	flags = flags_from_decl_or_type (TREE_TYPE (t));
> +      tree type = TREE_TYPE (CALL_EXPR_FN (t));
> +      if (type && TREE_CODE (type) == POINTER_TYPE)
> +	flags = flags_from_decl_or_type (TREE_TYPE (type));
>        else
>  	flags = 0;
> +      if (CALL_EXPR_BY_DESCRIPTOR (t))
> +	flags |= ECF_BY_DESCRIPTOR;
>      }
>  
>    return flags;
> @@ -2633,6 +2693,8 @@ expand_call (tree exp, rtx target, int i
>      {
>        fntype = TREE_TYPE (TREE_TYPE (addr));
>        flags |= flags_from_decl_or_type (fntype);
> +      if (CALL_EXPR_BY_DESCRIPTOR (exp))
> +	flags |= ECF_BY_DESCRIPTOR;
>      }
>    rettype = TREE_TYPE (exp);
>  
> @@ -3344,6 +3406,13 @@ expand_call (tree exp, rtx target, int i
>        if (STRICT_ALIGNMENT)
>  	store_unaligned_arguments_into_pseudos (args, num_actuals);
>  
> +      /* Prepare the address of the call.  This must be done before any
> +	 register parameters are loaded for find_first_parameter_load to
> +	 work properly in the presence of descriptors.  */
> +      funexp = prepare_call_address (fndecl ? fndecl : fntype, funexp,
> +				     static_chain_value, &call_fusage,
> +				     reg_parm_seen, flags);
> +
>        /* Now store any partially-in-registers parm.
>  	 This is the last place a block-move can happen.  */
>        if (reg_parm_seen)
> @@ -3454,10 +3523,6 @@ expand_call (tree exp, rtx target, int i
>  	}
>  
>        after_args = get_last_insn ();
> -      funexp = prepare_call_address (fndecl ? fndecl : fntype, funexp,
> -				     static_chain_value, &call_fusage,
> -				     reg_parm_seen, pass == 0);
> -
>        load_register_parameters (args, num_actuals, &call_fusage, flags,
>  				pass == 0, &sibcall_failure);
>  
> Index: cfgexpand.c
> ===================================================================
> --- cfgexpand.c	(revision 237789)
> +++ cfgexpand.c	(working copy)
> @@ -2636,6 +2636,7 @@ expand_call_stmt (gcall *stmt)
>    else
>      CALL_FROM_THUNK_P (exp) = gimple_call_from_thunk_p (stmt);
>    CALL_EXPR_VA_ARG_PACK (exp) = gimple_call_va_arg_pack_p (stmt);
> +  CALL_EXPR_BY_DESCRIPTOR (exp) = gimple_call_by_descriptor_p (stmt);
>    SET_EXPR_LOCATION (exp, gimple_location (stmt));
>    CALL_WITH_BOUNDS_P (exp) = gimple_call_with_bounds_p (stmt);
>  
> Index: common.opt
> ===================================================================
> --- common.opt	(revision 237789)
> +++ common.opt	(working copy)
> @@ -2303,6 +2303,10 @@ ftracer
>  Common Report Var(flag_tracer) Optimization
>  Perform superblock formation via tail duplication.
>  
> +ftrampolines
> +Common Report Var(flag_trampolines) Init(0)
> +Always generate trampolines for pointers to nested functions
> +
>  ; Zero means that floating-point math operations cannot generate a
>  ; (user-visible) trap.  This is the case, for example, in nonstop
>  ; IEEE 754 arithmetic.
> Index: config/aarch64/aarch64.h
> ===================================================================
> --- config/aarch64/aarch64.h	(revision 237789)
> +++ config/aarch64/aarch64.h	(working copy)
> @@ -779,6 +779,9 @@ typedef struct
>     correctly.  */
>  #define TRAMPOLINE_SECTION text_section
>  
> +/* Use custom descriptors instead of trampolines when possible.  */
> +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1
> +
>  /* To start with.  */
>  #define BRANCH_COST(SPEED_P, PREDICTABLE_P) \
>    (aarch64_branch_cost (SPEED_P, PREDICTABLE_P))
> Index: config/alpha/alpha.h
> ===================================================================
> --- config/alpha/alpha.h	(revision 237789)
> +++ config/alpha/alpha.h	(working copy)
> @@ -996,3 +996,6 @@ extern long alpha_auto_offset;
>  #define NO_IMPLICIT_EXTERN_C
>  
>  #define TARGET_SUPPORTS_WIDE_INT 1
> +
> +/* Use custom descriptors instead of trampolines when possible if not VMS.  */
> +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS (TARGET_ABI_OPEN_VMS ? 0 : 1)
> Index: config/arm/arm.c
> ===================================================================
> --- config/arm/arm.c	(revision 237789)
> +++ config/arm/arm.c	(working copy)
> @@ -6781,6 +6781,29 @@ arm_function_ok_for_sibcall (tree decl,
>        && DECL_WEAK (decl))
>      return false;
>  
> +  /* We cannot do a tailcall for an indirect call by descriptor if all the
> +     argument registers are used because the only register left to load the
> +     address is IP and it will already contain the static chain.  */
> +  if (!decl && CALL_EXPR_BY_DESCRIPTOR (exp) && !flag_trampolines)
> +    {
> +      tree fntype = TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (exp)));
> +      CUMULATIVE_ARGS cum;
> +      cumulative_args_t cum_v;
> +
> +      arm_init_cumulative_args (&cum, fntype, NULL_RTX, NULL_TREE);
> +      cum_v = pack_cumulative_args (&cum);
> +
> +      for (tree t = TYPE_ARG_TYPES (fntype); t; t = TREE_CHAIN (t))
> +	{
> +	  tree type = TREE_VALUE (t);
> +	  if (!VOID_TYPE_P (type))
> +	    arm_function_arg_advance (cum_v, TYPE_MODE (type), type, true);
> +	}
> +
> +      if (!arm_function_arg (cum_v, SImode, integer_type_node, true))
> +	return false;
> +    }
> +
>    /* Everything else is ok.  */
>    return true;
>  }
> Index: config/arm/arm.h
> ===================================================================
> --- config/arm/arm.h	(revision 237789)
> +++ config/arm/arm.h	(working copy)
> @@ -1632,6 +1632,10 @@ typedef struct
>  
>  /* Alignment required for a trampoline in bits.  */
>  #define TRAMPOLINE_ALIGNMENT  32
> +
> +/* Use custom descriptors instead of trampolines when possible, but
> +   we cannot use bit #0 because it is the ARM/Thumb selection bit.  */
> +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 2
>  
>  /* Addressing modes, and classification of registers for them.  */
>  #define HAVE_POST_INCREMENT   1
> Index: config/i386/i386.h
> ===================================================================
> --- config/i386/i386.h	(revision 237789)
> +++ config/i386/i386.h	(working copy)
> @@ -2660,6 +2660,9 @@ extern void debug_dispatch_window (int);
>  
>  #define TARGET_SUPPORTS_WIDE_INT 1
>  
> +/* Use custom descriptors instead of trampolines when possible.  */
> +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1
> +
>  /*
>  Local variables:
>  version-control: t
> Index: config/ia64/ia64.h
> ===================================================================
> --- config/ia64/ia64.h	(revision 237789)
> +++ config/ia64/ia64.h	(working copy)
> @@ -1714,4 +1714,7 @@ struct GTY(()) machine_function
>  /* Switch on code for querying unit reservations.  */
>  #define CPU_UNITS_QUERY 1
>  
> +/* IA-64 already uses descriptors for its standard calling sequence.  */
> +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 0
> +
>  /* End of ia64.h */
> Index: config/mips/mips.h
> ===================================================================
> --- config/mips/mips.h	(revision 237789)
> +++ config/mips/mips.h	(working copy)
> @@ -3413,3 +3413,6 @@ struct GTY(())  machine_function {
>  #define ENABLE_LD_ST_PAIRS \
>    (TARGET_LOAD_STORE_PAIRS && (TUNE_P5600 || TUNE_I6400) \
>     && !TARGET_MICROMIPS && !TARGET_FIX_24K)
> +
> +/* Use custom descriptors instead of trampolines when possible.  */
> +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1
> Index: config/pa/pa.h
> ===================================================================
> --- config/pa/pa.h	(revision 237789)
> +++ config/pa/pa.h	(working copy)
> @@ -1313,3 +1313,6 @@ do {									     \
>     seven and four instructions, respectively.  */  
>  #define MAX_PCREL17F_OFFSET \
>    (flag_pic ? (TARGET_HPUX ? 198164 : 221312) : 240000)
> +
> +/* HP-PA already uses descriptors for its standard calling sequence.  */
> +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 0
> Index: config/rs6000/rs6000.h
> ===================================================================
> --- config/rs6000/rs6000.h	(revision 237789)
> +++ config/rs6000/rs6000.h	(working copy)
> @@ -2894,3 +2894,6 @@ extern GTY(()) tree rs6000_builtin_types
>  extern GTY(()) tree rs6000_builtin_decls[RS6000_BUILTIN_COUNT];
>  
>  #define TARGET_SUPPORTS_WIDE_INT 1
> +
> +/* Use custom descriptors instead of trampolines when possible if not AIX.  */
> +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS (DEFAULT_ABI == ABI_AIX ? 0 : 1)
> Index: config/sparc/sparc.h
> ===================================================================
> --- config/sparc/sparc.h	(revision 237789)
> +++ config/sparc/sparc.h	(working copy)
> @@ -1817,3 +1817,6 @@ extern int sparc_indent_opcode;
>  #define SPARC_LOW_FE_EXCEPT_VALUES 0
>  
>  #define TARGET_SUPPORTS_WIDE_INT 1
> +
> +/* Use custom descriptors instead of trampolines when possible.  */
> +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1
> Index: defaults.h
> ===================================================================
> --- defaults.h	(revision 237789)
> +++ defaults.h	(working copy)
> @@ -1080,9 +1080,18 @@ see the files COPYING3 and COPYING.RUNTI
>  #define CASE_VECTOR_PC_RELATIVE 0
>  #endif
>  
> +/* Force minimum alignment to be able to use the least significant bits
> +   for distinguishing descriptor addresses from code addresses.  */
> +#define DEFAULT_FUNCTION_ALIGNMENT					\
> +  (lang_hooks.custom_function_descriptors				\
> +   && targetm.calls.custom_function_descriptors > 0			\
> +   ? MAX (FUNCTION_BOUNDARY,						\
> +	  2 * targetm.calls.custom_function_descriptors * BITS_PER_UNIT)\
> +   : FUNCTION_BOUNDARY)
> +
>  /* Assume that trampolines need function alignment.  */
>  #ifndef TRAMPOLINE_ALIGNMENT
> -#define TRAMPOLINE_ALIGNMENT FUNCTION_BOUNDARY
> +#define TRAMPOLINE_ALIGNMENT DEFAULT_FUNCTION_ALIGNMENT
>  #endif
>  
>  /* Register mappings for target machines without register windows.  */
> Index: doc/invoke.texi
> ===================================================================
> --- doc/invoke.texi	(revision 237789)
> +++ doc/invoke.texi	(working copy)
> @@ -498,7 +498,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fverbose-asm  -fpack-struct[=@var{n}]  @gol
>  -fleading-underscore  -ftls-model=@var{model} @gol
>  -fstack-reuse=@var{reuse_level} @gol
> --ftrapv  -fwrapv @gol
> +-ftrampolines  -ftrapv  -fwrapv @gol
>  -fvisibility=@r{[}default@r{|}internal@r{|}hidden@r{|}protected@r{]} @gol
>  -fstrict-volatile-bitfields -fsync-libcalls}
>  
> @@ -11546,6 +11546,31 @@ unit, or if @option{-fpic} is not given
>  The default without @option{-fpic} is @samp{initial-exec}; with
>  @option{-fpic} the default is @samp{global-dynamic}.
>  
> +@item -ftrampolines
> +@opindex ftrampolines
> +Always generate trampolines for pointers to nested functions.
> +
> +A trampoline is a small piece of data or code that is created at run
> +time on the stack when the address of a nested function is taken, and
> +is used to call the nested function indirectly.  For some targets, it
> +is made up of data only and thus requires no special treatment.  But,
> +for most targets, it is made up of code and thus requires the stack
> +to be made executable in order for the program to work properly.
> +
> +@option{-fno-trampolines} is enabled by default to let the compiler avoid
> +generating them if it computes that this is safe, on a case-by-case basis,
> +and replace them with descriptors.  Descriptors are always made up of data
> +only, but the generated code must be prepared to deal with them.
> +
> +This option has no effect for any language other than Ada as of this
> +writing.  Moreover, code compiled with @option{-ftrampolines} and code
> +compiled with @option{-fno-trampolines} are not binary compatible if
> +nested functions are present.  This option must therefore be used on
> +a program-wide basis and be manipulated with extreme care.
> +
> +This option has no effect for targets whose trampolines are made up of
> +data only, for example IA-64 targets, AIX or VMS platforms.
> +
>  @item -fvisibility=@r{[}default@r{|}internal@r{|}hidden@r{|}protected@r{]}
>  @opindex fvisibility
>  Set the default ELF image symbol visibility to the specified option---all
> Index: doc/tm.texi
> ===================================================================
> --- doc/tm.texi	(revision 237789)
> +++ doc/tm.texi	(working copy)
> @@ -5181,6 +5181,25 @@ be returned; otherwise @var{addr} should
>  If this hook is not defined, @var{addr} will be used for function calls.
>  @end deftypefn
>  
> +@deftypevr {Target Hook} int TARGET_CUSTOM_FUNCTION_DESCRIPTORS
> +This hook should be defined to a power of 2 if the target will benefit
> +from the use of custom descriptors for nested functions instead of the
> +standard trampolines.  Such descriptors are created at run time on the
> +stack and made up of data only, but they are non-standard so the generated
> +code must be prepared to deal with them.  This hook should be defined to 0
> +if the target uses function descriptors for its standard calling sequence,
> +like for example HP-PA or IA-64.  Using descriptors for nested functions
> +eliminates the need for trampolines that reside on the stack and require
> +it to be made executable.
> +
> +The value of the hook is used to parameterize the run-time identification
> +scheme implemented to distinguish descriptors from function addresses: it
> +gives the number of bytes by which their address is shifted in comparison
> +with function addresses.  The value of 1 will generally work, unless it is
> +already used by the target for a similar purpose, as for example on ARM,
> +where it is used to distinguish Thumb functions from ARM ones.
> +@end deftypevr
> +
>  Implementing trampolines is difficult on many machines because they have
>  separate instruction and data caches.  Writing into a stack location
>  fails to clear the memory in the instruction cache, so when the program
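As an aside for readers unfamiliar with the scheme described above, the call-site test it implies can be modeled in plain C.  This is an illustrative sketch only, not the code GCC generates: the two-word descriptor layout, the word order, and the shift value of 1 are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical descriptor layout: the static chain value and the entry
   point of the nested function.  The word order is assumed here.  */
struct descriptor
{
  void *chain;                   /* static chain value */
  int (*fn) (void *chain, int);  /* entry point expecting the chain */
};

/* Model of the call-site test when the hook value is 1: code addresses
   are at least 2-byte aligned, so a set bit 0 marks the value as
   descriptor_address + 1 rather than an ordinary function address.  */
static int
call_maybe_descriptor (uintptr_t addr, int arg)
{
  if (addr & 1)
    {
      struct descriptor *d = (struct descriptor *) (addr - 1);
      return d->fn (d->chain, arg);
    }
  return ((int (*) (int)) addr) (arg);
}

/* A "nested" function reaching its environment through the chain,
   and a plain function taking no static chain.  */
static int
nested_like (void *chain, int arg)
{
  return *(int *) chain + arg;
}

static int
plain (int arg)
{
  return 2 * arg;
}
```

Both kinds of pointer then flow through the same indirect call site, which is exactly why code compiled with and without the scheme is not binary compatible.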
> Index: doc/tm.texi.in
> ===================================================================
> --- doc/tm.texi.in	(revision 237789)
> +++ doc/tm.texi.in	(working copy)
> @@ -3947,6 +3947,8 @@ is used for aligning trampolines.
>  
>  @hook TARGET_TRAMPOLINE_ADJUST_ADDRESS
>  
> +@hook TARGET_CUSTOM_FUNCTION_DESCRIPTORS
> +
>  Implementing trampolines is difficult on many machines because they have
>  separate instruction and data caches.  Writing into a stack location
>  fails to clear the memory in the instruction cache, so when the program
> Index: gimple.c
> ===================================================================
> --- gimple.c	(revision 237789)
> +++ gimple.c	(working copy)
> @@ -373,6 +373,7 @@ gimple_build_call_from_tree (tree t)
>      gimple_call_set_from_thunk (call, CALL_FROM_THUNK_P (t));
>    gimple_call_set_va_arg_pack (call, CALL_EXPR_VA_ARG_PACK (t));
>    gimple_call_set_nothrow (call, TREE_NOTHROW (t));
> +  gimple_call_set_by_descriptor (call, CALL_EXPR_BY_DESCRIPTOR (t));
>    gimple_set_no_warning (call, TREE_NO_WARNING (t));
>    gimple_call_set_with_bounds (call, CALL_WITH_BOUNDS_P (t));
>  
> @@ -1386,6 +1387,9 @@ gimple_call_flags (const gimple *stmt)
>    if (stmt->subcode & GF_CALL_NOTHROW)
>      flags |= ECF_NOTHROW;
>  
> +  if (stmt->subcode & GF_CALL_BY_DESCRIPTOR)
> +    flags |= ECF_BY_DESCRIPTOR;
> +
>    return flags;
>  }
>  
> Index: gimple.h
> ===================================================================
> --- gimple.h	(revision 237789)
> +++ gimple.h	(working copy)
> @@ -146,6 +146,7 @@ enum gf_mask {
>      GF_CALL_CTRL_ALTERING       = 1 << 7,
>      GF_CALL_WITH_BOUNDS 	= 1 << 8,
>      GF_CALL_MUST_TAIL_CALL	= 1 << 9,
> +    GF_CALL_BY_DESCRIPTOR	= 1 << 10,
>      GF_OMP_PARALLEL_COMBINED	= 1 << 0,
>      GF_OMP_PARALLEL_GRID_PHONY = 1 << 1,
>      GF_OMP_TASK_TASKLOOP	= 1 << 0,
> @@ -3357,6 +3358,26 @@ gimple_call_alloca_for_var_p (gcall *s)
>    return (s->subcode & GF_CALL_ALLOCA_FOR_VAR) != 0;
>  }
>  
> +/* If BY_DESCRIPTOR_P is true, GIMPLE_CALL S is an indirect call for which
> +   pointers to nested functions are descriptors instead of trampolines.  */
> +
> +static inline void
> +gimple_call_set_by_descriptor (gcall *s, bool by_descriptor_p)
> +{
> +  if (by_descriptor_p)
> +    s->subcode |= GF_CALL_BY_DESCRIPTOR;
> +  else
> +    s->subcode &= ~GF_CALL_BY_DESCRIPTOR;
> +}
> +
> +/* Return true if S is a by-descriptor call.  */
> +
> +static inline bool
> +gimple_call_by_descriptor_p (gcall *s)
> +{
> +  return (s->subcode & GF_CALL_BY_DESCRIPTOR) != 0;
> +}
> +
>  /* Copy all the GF_CALL_* flags from ORIG_CALL to DEST_CALL.  */
>  
>  static inline void
> Index: langhooks-def.h
> ===================================================================
> --- langhooks-def.h	(revision 237789)
> +++ langhooks-def.h	(working copy)
> @@ -120,6 +120,7 @@ extern bool lhd_omp_mappable_type (tree)
>  #define LANG_HOOKS_BLOCK_MAY_FALLTHRU	hook_bool_const_tree_true
>  #define LANG_HOOKS_EH_USE_CXA_END_CLEANUP	false
>  #define LANG_HOOKS_DEEP_UNSHARING	false
> +#define LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS	false
>  
>  /* Attribute hooks.  */
>  #define LANG_HOOKS_ATTRIBUTE_TABLE		NULL
> @@ -319,7 +320,8 @@ extern void lhd_end_section (void);
>    LANG_HOOKS_EH_PROTECT_CLEANUP_ACTIONS, \
>    LANG_HOOKS_BLOCK_MAY_FALLTHRU, \
>    LANG_HOOKS_EH_USE_CXA_END_CLEANUP, \
> -  LANG_HOOKS_DEEP_UNSHARING \
> +  LANG_HOOKS_DEEP_UNSHARING, \
> +  LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS \
>  }
>  
>  #endif /* GCC_LANG_HOOKS_DEF_H */
> Index: langhooks.h
> ===================================================================
> --- langhooks.h	(revision 237789)
> +++ langhooks.h	(working copy)
> @@ -505,6 +505,10 @@ struct lang_hooks
>       gimplification.  */
>    bool deep_unsharing;
>  
> +  /* True if this language may use custom descriptors for nested functions
> +     instead of trampolines.  */
> +  bool custom_function_descriptors;
> +
>    /* Whenever you add entries here, make sure you adjust langhooks-def.h
>       and langhooks.c accordingly.  */
>  };
> Index: rtl.h
> ===================================================================
> --- rtl.h	(revision 237789)
> +++ rtl.h	(working copy)
> @@ -317,6 +317,7 @@ struct GTY((desc("0"), tag("0"),
>       1 in a CONCAT is VAL_EXPR_IS_COPIED in var-tracking.c.
>       1 in a VALUE is SP_BASED_VALUE_P in cselib.c.
>       1 in a SUBREG generated by LRA for reload insns.
> +     1 in a REG if this is a static chain register.
>       1 in a CALL for calls instrumented by Pointer Bounds Checker.  */
>    unsigned int jump : 1;
>    /* In a CODE_LABEL, part of the two-bit alternate entry field.
> @@ -2264,6 +2265,10 @@ do {								        \
>   : (SIGN) == SRP_SIGNED ? SUBREG_PROMOTED_SIGNED_P (RTX)		\
>   : SUBREG_PROMOTED_UNSIGNED_P (RTX))
>  
> +/* True if the REG is the static chain register for some CALL_INSN.  */
> +#define STATIC_CHAIN_REG_P(RTX)	\
> +  (RTL_FLAG_CHECK1 ("STATIC_CHAIN_REG_P", (RTX), REG)->jump)
> +
>  /* True if the subreg was generated by LRA for reload insns.  Such
>     subregs are valid only during LRA.  */
>  #define LRA_SUBREG_P(RTX)	\
> Index: rtlanal.c
> ===================================================================
> --- rtlanal.c	(revision 237789)
> +++ rtlanal.c	(working copy)
> @@ -3914,7 +3914,8 @@ find_first_parameter_load (rtx_insn *cal
>    parm.nregs = 0;
>    for (p = CALL_INSN_FUNCTION_USAGE (call_insn); p; p = XEXP (p, 1))
>      if (GET_CODE (XEXP (p, 0)) == USE
> -	&& REG_P (XEXP (XEXP (p, 0), 0)))
> +	&& REG_P (XEXP (XEXP (p, 0), 0))
> +	&& !STATIC_CHAIN_REG_P (XEXP (XEXP (p, 0), 0)))
>        {
>  	gcc_assert (REGNO (XEXP (XEXP (p, 0), 0)) < FIRST_PSEUDO_REGISTER);
>  
> Index: target.def
> ===================================================================
> --- target.def	(revision 237789)
> +++ target.def	(working copy)
> @@ -4723,6 +4723,26 @@ be returned; otherwise @var{addr} should
>  If this hook is not defined, @var{addr} will be used for function calls.",
>   rtx, (rtx addr), NULL)
>  
> +DEFHOOKPOD
> +(custom_function_descriptors,
> + "This hook should be defined to a power of 2 if the target will benefit\n\
> +from the use of custom descriptors for nested functions instead of the\n\
> +standard trampolines.  Such descriptors are created at run time on the\n\
> +stack and made up of data only, but they are non-standard, so the generated\n\
> +code must be prepared to deal with them.  This hook should be defined to 0\n\
> +if the target uses function descriptors for its standard calling sequence,\n\
> +like for example HP-PA or IA-64.  Using descriptors for nested functions\n\
> +eliminates the need for trampolines that reside on the stack and require\n\
> +it to be made executable.\n\
> +\n\
> +The value of the hook is used to parameterize the run-time identification\n\
> +scheme implemented to distinguish descriptors from function addresses: it\n\
> +gives the number of bytes by which their address is shifted in comparison\n\
> +with function addresses.  The value of 1 will generally work, unless it is\n\
> +already used by the target for a similar purpose, as for example on ARM,\n\
> +where it is used to distinguish Thumb functions from ARM ones.",
> + int, -1)
> +
>  /* Return the number of bytes of its own arguments that a function
>     pops on returning, or 0 if the function pops no arguments and the
>     caller must therefore pop them all after the function returns.  */
> Index: testsuite/gnat.dg/trampoline3.adb
> ===================================================================
> --- testsuite/gnat.dg/trampoline3.adb	(revision 0)
> +++ testsuite/gnat.dg/trampoline3.adb	(working copy)
> @@ -0,0 +1,22 @@
> +-- { dg-do compile { target *-*-linux* } }
> +-- { dg-options "-gnatws" }
> +
> +procedure Trampoline3 is
> +
> +  A : Integer;
> +
> +  type FuncPtr is access function (I : Integer) return Integer;
> +
> +  function F (I : Integer) return Integer is
> +  begin
> +    return A + I;
> +  end F;
> +
> +  P : FuncPtr := F'Access;
> +  I : Integer;
> +
> +begin
> +  I := P(0);
> +end;
> +
> +-- { dg-final { scan-assembler-not "GNU-stack.*x" } }
> Index: testsuite/gnat.dg/trampoline4.adb
> ===================================================================
> --- testsuite/gnat.dg/trampoline4.adb	(revision 0)
> +++ testsuite/gnat.dg/trampoline4.adb	(working copy)
> @@ -0,0 +1,22 @@
> +-- { dg-do compile { target *-*-linux* } }
> +-- { dg-options "-ftrampolines -gnatws" }
> +
> +procedure Trampoline4 is
> +
> +  A : Integer;
> +
> +  type FuncPtr is access function (I : Integer) return Integer;
> +
> +  function F (I : Integer) return Integer is
> +  begin
> +    return A + I;
> +  end F;
> +
> +  P : FuncPtr := F'Access;
> +  I : Integer;
> +
> +begin
> +  I := P(0);
> +end;
> +
> +-- { dg-final { scan-assembler "GNU-stack.*x" } }
> Index: tree-core.h
> ===================================================================
> --- tree-core.h	(revision 237789)
> +++ tree-core.h	(working copy)
> @@ -90,6 +90,9 @@ struct die_struct;
>  /* Nonzero if this call is into the transaction runtime library.  */
>  #define ECF_TM_BUILTIN		  (1 << 13)
>  
> +/* Nonzero if this is an indirect call by descriptor.  */
> +#define ECF_BY_DESCRIPTOR	  (1 << 14)
> +
>  /* Call argument flags.  */
>  /* Nonzero if the argument is not dereferenced recursively, thus only
>     directly reachable memory is read or written.  */
> @@ -1216,6 +1219,12 @@ struct GTY(()) tree_base {
>  
>         REF_REVERSE_STORAGE_ORDER in
>             BIT_FIELD_REF, MEM_REF
> +
> +       FUNC_ADDR_BY_DESCRIPTOR in
> +           ADDR_EXPR
> +
> +       CALL_EXPR_BY_DESCRIPTOR in
> +           CALL_EXPR
>  */
>  
>  struct GTY(()) tree_typed {
> Index: tree-nested.c
> ===================================================================
> --- tree-nested.c	(revision 237789)
> +++ tree-nested.c	(working copy)
> @@ -21,6 +21,7 @@
>  #include "system.h"
>  #include "coretypes.h"
>  #include "backend.h"
> +#include "target.h"
>  #include "rtl.h"
>  #include "tree.h"
>  #include "gimple.h"
> @@ -103,6 +104,7 @@ struct nesting_info
>  
>    bool any_parm_remapped;
>    bool any_tramp_created;
> +  bool any_descr_created;
>    char static_chain_added;
>  };
>  
> @@ -486,12 +488,40 @@ get_trampoline_type (struct nesting_info
>    return trampoline_type;
>  }
>  
> -/* Given DECL, a nested function, find or create a field in the non-local
> -   frame structure for a trampoline for this function.  */
> +/* Build or return the type used to represent a nested function descriptor.  */
> +
> +static GTY(()) tree descriptor_type;
>  
>  static tree
> -lookup_tramp_for_decl (struct nesting_info *info, tree decl,
> -		       enum insert_option insert)
> +get_descriptor_type (struct nesting_info *info)
> +{
> +  tree t;
> +
> +  if (descriptor_type)
> +    return descriptor_type;
> +
> +  t = build_index_type (build_int_cst (NULL_TREE, 2 * UNITS_PER_WORD - 1));
> +  t = build_array_type (char_type_node, t);
> +  t = build_decl (DECL_SOURCE_LOCATION (info->context),
> +		  FIELD_DECL, get_identifier ("__data"), t);
> +  SET_DECL_ALIGN (t, BITS_PER_WORD);
> +  DECL_USER_ALIGN (t) = 1;
> +
> +  descriptor_type = make_node (RECORD_TYPE);
> +  TYPE_NAME (descriptor_type) = get_identifier ("__builtin_descriptor");
> +  TYPE_FIELDS (descriptor_type) = t;
> +  layout_type (descriptor_type);
> +  DECL_CONTEXT (t) = descriptor_type;
> +
> +  return descriptor_type;
> +}
> +
> +/* Given DECL, a nested function, find or create an element in the
> +   var map for this function.  */
> +
> +static tree
> +lookup_element_for_decl (struct nesting_info *info, tree decl,
> +			 enum insert_option insert)
>  {
>    if (insert == NO_INSERT)
>      {
> @@ -501,19 +531,73 @@ lookup_tramp_for_decl (struct nesting_in
>  
>    tree *slot = &info->var_map->get_or_insert (decl);
>    if (!*slot)
> -    {
> -      tree field = make_node (FIELD_DECL);
> -      DECL_NAME (field) = DECL_NAME (decl);
> -      TREE_TYPE (field) = get_trampoline_type (info);
> -      TREE_ADDRESSABLE (field) = 1;
> +    *slot = build_tree_list (NULL_TREE, NULL_TREE);
>  
> -      insert_field_into_struct (get_frame_type (info), field);
> -      *slot = field;
> +  return (tree) *slot;
> +} 
> +
> +/* Given DECL, a nested function, create a field in the non-local
> +   frame structure for this function.  */
> +
> +static tree
> +create_field_for_decl (struct nesting_info *info, tree decl, tree type)
> +{
> +  tree field = make_node (FIELD_DECL);
> +  DECL_NAME (field) = DECL_NAME (decl);
> +  TREE_TYPE (field) = type;
> +  TREE_ADDRESSABLE (field) = 1;
> +  insert_field_into_struct (get_frame_type (info), field);
> +  return field;
> +}
> +
> +/* Given DECL, a nested function, find or create a field in the non-local
> +   frame structure for a trampoline for this function.  */
> +
> +static tree
> +lookup_tramp_for_decl (struct nesting_info *info, tree decl,
> +		       enum insert_option insert)
> +{
> +  tree elt, field;
> +
> +  elt = lookup_element_for_decl (info, decl, insert);
> +  if (!elt)
> +    return NULL_TREE;
> +
> +  field = TREE_PURPOSE (elt);
>  
> +  if (!field && insert == INSERT)
> +    {
> +      field = create_field_for_decl (info, decl, get_trampoline_type (info));
> +      TREE_PURPOSE (elt) = field;
>        info->any_tramp_created = true;
>      }
>  
> -  return *slot;
> +  return field;
> +}
> +
> +/* Given DECL, a nested function, find or create a field in the non-local
> +   frame structure for a descriptor for this function.  */
> +
> +static tree
> +lookup_descr_for_decl (struct nesting_info *info, tree decl,
> +		       enum insert_option insert)
> +{
> +  tree elt, field;
> +
> +  elt = lookup_element_for_decl (info, decl, insert);
> +  if (!elt)
> +    return NULL_TREE;
> +
> +  field = TREE_VALUE (elt);
> +
> +  if (!field && insert == INSERT)
> +    {
> +      field = create_field_for_decl (info, decl, get_descriptor_type (info));
> +      TREE_VALUE (elt) = field;
> +      info->any_descr_created = true;
> +    }
> +
> +  return field;
>  }
>  
>  /* Build or return the field within the non-local frame state that holds
> @@ -2303,6 +2387,7 @@ convert_tramp_reference_op (tree *tp, in
>    struct walk_stmt_info *wi = (struct walk_stmt_info *) data;
>    struct nesting_info *const info = (struct nesting_info *) wi->info, *i;
>    tree t = *tp, decl, target_context, x, builtin;
> +  bool descr;
>    gcall *call;
>  
>    *walk_subtrees = 0;
> @@ -2337,7 +2422,14 @@ convert_tramp_reference_op (tree *tp, in
>  	 we need to insert the trampoline.  */
>        for (i = info; i->context != target_context; i = i->outer)
>  	continue;
> -      x = lookup_tramp_for_decl (i, decl, INSERT);
> +
> +      /* Decide whether to generate a descriptor or a trampoline.  */
> +      descr = FUNC_ADDR_BY_DESCRIPTOR (t) && !flag_trampolines;
> +
> +      if (descr)
> +	x = lookup_descr_for_decl (i, decl, INSERT);
> +      else
> +	x = lookup_tramp_for_decl (i, decl, INSERT);
>  
>        /* Compute the address of the field holding the trampoline.  */
>        x = get_frame_field (info, target_context, x, &wi->gsi);
> @@ -2346,7 +2438,10 @@ convert_tramp_reference_op (tree *tp, in
>  
>        /* Do machine-specific ugliness.  Normally this will involve
>  	 computing extra alignment, but it can really be anything.  */
> -      builtin = builtin_decl_implicit (BUILT_IN_ADJUST_TRAMPOLINE);
> +      if (descr)
> +	builtin = builtin_decl_implicit (BUILT_IN_ADJUST_DESCRIPTOR);
> +      else
> +	builtin = builtin_decl_implicit (BUILT_IN_ADJUST_TRAMPOLINE);
>        call = gimple_build_call (builtin, 1, x);
>        x = init_tmp_var_with_call (info, &wi->gsi, call);
>  
> @@ -2820,6 +2915,27 @@ fold_mem_refs (tree *const &e, void *dat
>    return true;
>  }
>  
> +/* Given DECL, a nested function, build an initialization call for FIELD,
> +   the trampoline or descriptor for DECL, using FUNC as the function.  */
> +
> +static gcall *
> +build_init_call_stmt (struct nesting_info *info, tree decl, tree field,
> +		      tree func)
> +{
> +  tree arg1, arg2, arg3, x;
> +
> +  gcc_assert (DECL_STATIC_CHAIN (decl));
> +  arg3 = build_addr (info->frame_decl);
> +
> +  arg2 = build_addr (decl);
> +
> +  x = build3 (COMPONENT_REF, TREE_TYPE (field),
> +	      info->frame_decl, field, NULL_TREE);
> +  arg1 = build_addr (x);
> +
> +  return gimple_build_call (func, 3, arg1, arg2, arg3);
> +}
> +
>  /* Do "everything else" to clean up or complete state collected by the various
>     walking passes -- create a field to hold the frame base address, lay out the
>     types and decls, generate code to initialize the frame decl, store critical
> @@ -2965,23 +3081,32 @@ finalize_nesting_tree_1 (struct nesting_
>        struct nesting_info *i;
>        for (i = root->inner; i ; i = i->next)
>  	{
> -	  tree arg1, arg2, arg3, x, field;
> +	  tree field, x;
>  
>  	  field = lookup_tramp_for_decl (root, i->context, NO_INSERT);
>  	  if (!field)
>  	    continue;
>  
> -	  gcc_assert (DECL_STATIC_CHAIN (i->context));
> -	  arg3 = build_addr (root->frame_decl);
> +	  x = builtin_decl_implicit (BUILT_IN_INIT_TRAMPOLINE);
> +	  stmt = build_init_call_stmt (root, i->context, field, x);
> +	  gimple_seq_add_stmt (&stmt_list, stmt);
> +	}
> +    }
>  
> -	  arg2 = build_addr (i->context);
> +  /* If descriptors were created, then we need to initialize them.  */
> +  if (root->any_descr_created)
> +    {
> +      struct nesting_info *i;
> +      for (i = root->inner; i ; i = i->next)
> +	{
> +	  tree field, x;
>  
> -	  x = build3 (COMPONENT_REF, TREE_TYPE (field),
> -		      root->frame_decl, field, NULL_TREE);
> -	  arg1 = build_addr (x);
> +	  field = lookup_descr_for_decl (root, i->context, NO_INSERT);
> +	  if (!field)
> +	    continue;
>  
> -	  x = builtin_decl_implicit (BUILT_IN_INIT_TRAMPOLINE);
> -	  stmt = gimple_build_call (x, 3, arg1, arg2, arg3);
> +	  x = builtin_decl_implicit (BUILT_IN_INIT_DESCRIPTOR);
> +	  stmt = build_init_call_stmt (root, i->context, field, x);
>  	  gimple_seq_add_stmt (&stmt_list, stmt);
>  	}
>      }
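The initialization statements built above expand to calls to the two new builtins.  A plausible C model of their runtime effect is sketched below; the word order within the descriptor and the use of an offset of 1 are illustrative assumptions, not the definitive expansion.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of __builtin_init_descriptor (descr, func, chain): fill the
   two-word stack slot with plain data.  No instructions are written,
   which is why the stack need not be made executable.  */
static void
model_init_descriptor (void *descr, void *func, void *chain)
{
  memcpy ((char *) descr, &chain, sizeof chain);
  memcpy ((char *) descr + sizeof chain, &func, sizeof func);
}

/* Model of __builtin_adjust_descriptor (descr): tag the address so
   that call sites can tell it apart from an ordinary, aligned
   function address.  */
static uintptr_t
model_adjust_descriptor (void *descr)
{
  return (uintptr_t) descr + 1;
}
```

Contrast this with __builtin_init_trampoline, which must copy executable instructions into the stack slot and flush the instruction cache.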
> Index: tree.c
> ===================================================================
> --- tree.c	(revision 237789)
> +++ tree.c	(working copy)
> @@ -1019,7 +1019,7 @@ make_node_stat (enum tree_code code MEM_
>  	{
>  	  if (code == FUNCTION_DECL)
>  	    {
> -	      SET_DECL_ALIGN (t, FUNCTION_BOUNDARY);
> +	      SET_DECL_ALIGN (t, DEFAULT_FUNCTION_ALIGNMENT);
>  	      DECL_MODE (t) = FUNCTION_MODE;
>  	    }
>  	  else
> @@ -10567,12 +10567,19 @@ build_common_builtin_nodes (void)
>  			BUILT_IN_INIT_HEAP_TRAMPOLINE,
>  			"__builtin_init_heap_trampoline",
>  			ECF_NOTHROW | ECF_LEAF);
> +  local_define_builtin ("__builtin_init_descriptor", ftype,
> +			BUILT_IN_INIT_DESCRIPTOR,
> +			"__builtin_init_descriptor", ECF_NOTHROW | ECF_LEAF);
>  
>    ftype = build_function_type_list (ptr_type_node, ptr_type_node, NULL_TREE);
>    local_define_builtin ("__builtin_adjust_trampoline", ftype,
>  			BUILT_IN_ADJUST_TRAMPOLINE,
>  			"__builtin_adjust_trampoline",
>  			ECF_CONST | ECF_NOTHROW);
> +  local_define_builtin ("__builtin_adjust_descriptor", ftype,
> +			BUILT_IN_ADJUST_DESCRIPTOR,
> +			"__builtin_adjust_descriptor",
> +			ECF_CONST | ECF_NOTHROW);
>  
>    ftype = build_function_type_list (void_type_node,
>  				    ptr_type_node, ptr_type_node, NULL_TREE);
> Index: tree.h
> ===================================================================
> --- tree.h	(revision 237789)
> +++ tree.h	(working copy)
> @@ -967,6 +967,16 @@ extern void omp_clause_range_check_faile
>  #define REF_REVERSE_STORAGE_ORDER(NODE) \
>    (TREE_CHECK2 (NODE, BIT_FIELD_REF, MEM_REF)->base.default_def_flag)
>  
> +/* In an ADDR_EXPR, indicates that this is a pointer to a nested function
> +   represented by a descriptor instead of a trampoline.  */
> +#define FUNC_ADDR_BY_DESCRIPTOR(NODE) \
> +  (TREE_CHECK (NODE, ADDR_EXPR)->base.default_def_flag)
> +
> +/* In a CALL_EXPR, indicates that this is an indirect call for which
> +   pointers to nested functions are descriptors instead of trampolines.  */
> +#define CALL_EXPR_BY_DESCRIPTOR(NODE) \
> +  (TREE_CHECK (NODE, CALL_EXPR)->base.default_def_flag)
> +
>  /* These flags are available for each language front end to use internally.  */
>  #define TREE_LANG_FLAG_0(NODE) \
>    (TREE_NOT_CHECK2 (NODE, TREE_VEC, SSA_NAME)->base.u.bits.lang_flag_0)
> 

