[PATCH 1/4] Generate off-stack nested function trampolines

Maxim Blinov maxim.blinov@embecosm.com
Sat Nov 13 09:45:22 GMT 2021


Add support for allocating nested function trampolines on an
executable heap rather than on the stack. This is motivated by targets
such as AArch64 Darwin, which globally prohibit executing code on the
stack.

The target-specific routines for allocating and writing trampolines
are to be provided in libgcc. By default they are _not_ compiled in
unless the target specifically requires them, or you manually pass
--enable-off-stack-trampolines when configuring gcc/libgcc.

The gcc flag -foff-stack-trampolines controls whether to generate code
that instantiates trampolines on the stack, or to emit calls to
__builtin_nested_func_ptr_created and
__builtin_nested_func_ptr_deleted. Note that this flag is completely
independent of libgcc: If libgcc is for any reason missing those
symbols, you will get a link failure.
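
As a rough illustration of the call pair that -foff-stack-trampolines
emits, the sketch below models it in portable C. It is not the libgcc
implementation: the names model_created, model_deleted, model_outer and
struct model_tramp are hypothetical, and a heap descriptor holding the
function and its static chain stands in for real executable trampoline
code. The argument order (chain, func, dst) follows the declaration
added to libgcc2.h, except that func is a typed function pointer here
rather than void *.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of an off-stack trampoline: a heap record of the
   nested function and its static chain, not executable code.  */
struct model_tramp
{
  int (*func) (void *chain, int arg);
  void *chain;
  struct model_tramp *next;
};

/* Most recently created trampoline first.  */
static struct model_tramp *model_live;

/* Stand-in for __builtin_nested_func_ptr_created (chain, func, dst):
   allocate off the stack and store the result through DST.  */
static void
model_created (void *chain, int (*func) (void *, int),
	       struct model_tramp **dst)
{
  struct model_tramp *t = malloc (sizeof *t);
  assert (t != NULL);
  t->func = func;
  t->chain = chain;
  t->next = model_live;
  model_live = t;
  *dst = t;
}

/* Stand-in for __builtin_nested_func_ptr_deleted (): it takes no
   arguments, so it can only release the most recent trampoline.  */
static void
model_deleted (void)
{
  struct model_tramp *t = model_live;
  assert (t != NULL);
  model_live = t->next;
  free (t);
}

/* The enclosing function's frame: one captured variable plus the
   single pointer slot that replaces the on-stack trampoline.  */
struct outer_frame
{
  int base;
  struct model_tramp *tramp;
};

/* The "nested" function, reaching its parent's locals via the chain.  */
static int
add_base (void *chain, int arg)
{
  struct outer_frame *frame = chain;
  return frame->base + arg;
}

/* What the lowered enclosing function does: create on entry, call
   through the stored pointer, delete on exit.  */
int
model_outer (int base, int arg)
{
  struct outer_frame frame = { base, NULL };
  model_created (&frame, add_base, &frame.tramp);
  int result = frame.tramp->func (frame.tramp->chain, arg);
  model_deleted ();
  return result;
}

/* Number of trampolines still allocated.  */
int
model_live_count (void)
{
  int n = 0;
  for (struct model_tramp *t = model_live; t != NULL; t = t->next)
    n++;
  return n;
}
```

Calling model_outer (40, 2) goes through one create/delete pair and
leaves model_live_count () at zero, which is the balanced behaviour the
generated try/finally is meant to guarantee on normal exits.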

This implementation imposes some implicit restrictions as compared to
stack trampolines. longjmp'ing back to a state before a trampoline was
created will cause us to skip over the corresponding
__builtin_nested_func_ptr_deleted, which will leak trampolines
starting from the beginning of the linked list of allocated
trampolines. There may be scope for instrumenting longjmp/setjmp to
trigger cleanups of trampolines.
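
The longjmp restriction can be reproduced with a small simulation. This
is only a sketch under assumptions: sim_created, sim_deleted and
sim_leak_demo are hypothetical names, and the runtime is assumed to
keep a simple last-in, first-out list, which the argument-less delete
call suggests but which this patch does not itself define.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the runtime's list of live trampolines.  */
struct sim_tramp
{
  struct sim_tramp *next;
};

static struct sim_tramp *sim_head;
static jmp_buf sim_env;

/* Push a new entry, as a trampoline creation would.  */
static void
sim_created (void)
{
  struct sim_tramp *t = malloc (sizeof *t);
  assert (t != NULL);
  t->next = sim_head;
  sim_head = t;
}

/* Pop the most recent entry, as a trampoline deletion would.  */
static void
sim_deleted (void)
{
  struct sim_tramp *t = sim_head;
  assert (t != NULL);
  sim_head = t->next;
  free (t);
}

/* Count the entries that are still allocated.  */
static int
sim_live_count (void)
{
  int n = 0;
  for (struct sim_tramp *t = sim_head; t != NULL; t = t->next)
    n++;
  return n;
}

/* Models a function containing a nested function: create on entry,
   delete on exit.  With ESCAPE set, longjmp leaves before the delete.  */
static void
sim_inner (int escape)
{
  sim_created ();
  if (escape)
    longjmp (sim_env, 1);	/* Skips the matching sim_deleted.  */
  sim_deleted ();
}

/* Returns the number of entries left behind after one call.  */
int
sim_leak_demo (int escape)
{
  if (setjmp (sim_env) == 0)
    sim_inner (escape);
  return sim_live_count ();
}
```

A normal return leaves the list empty, whereas the longjmp path leaves
one entry at the head of the list that nothing will ever pop.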

Co-authored-by: Andrew Burgess <andrew.burgess@embecosm.com>

gcc/ChangeLog:

        * builtins.def (BUILT_IN_NESTED_PTR_CREATED): Define.
        (BUILT_IN_NESTED_PTR_DELETED): Ditto.
        * common.opt (foff-stack-trampolines): Add flag to control
        generation of heap-based trampoline instantiation.
        * tree-nested.c (convert_tramp_reference_op): Don't bother calling
        __builtin_adjust_trampoline for the off-stack case.
        (finalize_nesting_tree_1): Emit calls to
        __builtin_nested_...{created,deleted} if we're generating with
        -foff-stack-trampolines.
        * tree.c (build_common_builtin_nodes): Build
        __builtin_nested_...{created,deleted}.
	* doc/invoke.texi (-foff-stack-trampolines): Document.

libgcc/ChangeLog:

	* configure.ac: Add configure parameter
        --enable-off-stack-trampolines, and do error checking if we're
        trying to enable off-stack trampolines for a platform that doesn't
        provide any such implementation.
	* configure: Regenerate.
	* libgcc-std.ver.in (GCC_7.0.0): Export the new symbols.
	* libgcc2.h (__builtin_nested_func_ptr_created): Declare.
        (__builtin_nested_func_ptr_deleted): Ditto.
---
 gcc/builtins.def         |   2 +
 gcc/common.opt           |   4 ++
 gcc/config.gcc           |   7 +++
 gcc/doc/invoke.texi      |  14 +++++
 gcc/tree-nested.c        | 121 +++++++++++++++++++++++++++++++++------
 gcc/tree.c               |  17 ++++++
 libgcc/configure         |  26 +++++++++
 libgcc/configure.ac      |  17 ++++++
 libgcc/libgcc-std.ver.in |   3 +
 libgcc/libgcc2.h         |   3 +
 10 files changed, 197 insertions(+), 17 deletions(-)

diff --git a/gcc/builtins.def b/gcc/builtins.def
index 45a09b4d42d..90a94a6dd0f 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -950,6 +950,8 @@ DEF_BUILTIN_STUB (BUILT_IN_ADJUST_TRAMPOLINE, "__builtin_adjust_trampoline")
 DEF_BUILTIN_STUB (BUILT_IN_INIT_DESCRIPTOR, "__builtin_init_descriptor")
 DEF_BUILTIN_STUB (BUILT_IN_ADJUST_DESCRIPTOR, "__builtin_adjust_descriptor")
 DEF_BUILTIN_STUB (BUILT_IN_NONLOCAL_GOTO, "__builtin_nonlocal_goto")
+DEF_BUILTIN_STUB (BUILT_IN_NESTED_PTR_CREATED, "__builtin_nested_func_ptr_created")
+DEF_BUILTIN_STUB (BUILT_IN_NESTED_PTR_DELETED, "__builtin_nested_func_ptr_deleted")
 
 /* Implementing __builtin_setjmp.  */
 DEF_BUILTIN_STUB (BUILT_IN_SETJMP_SETUP, "__builtin_setjmp_setup")
diff --git a/gcc/common.opt b/gcc/common.opt
index de9b848eda5..a97aeeb2165 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2149,6 +2149,10 @@ foffload-abi=
 Common Joined RejectNegative Enum(offload_abi)
 -foffload-abi=[lp64|ilp32]	Set the ABI to use in an offload compiler.
 
+foff-stack-trampolines
+Common RejectNegative Var(flag_off_stack_trampolines) Init(OFF_STACK_TRAMPOLINES_INIT)
+Generate trampolines in executable heap memory rather than on the executable stack.
+
 Enum
 Name(offload_abi) Type(enum offload_abi) UnknownError(unknown offload ABI %qs)
 
diff --git a/gcc/config.gcc b/gcc/config.gcc
index edd12655c4a..c479aa4cc44 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1070,6 +1070,13 @@ case ${target} in
   ;;
 esac
 
+# Figure out if we need to enable -foff-stack-trampolines by default.
+case ${target} in
+*)
+  tm_defines="$tm_defines OFF_STACK_TRAMPOLINES_INIT=0"
+  ;;
+esac
+
 case ${target} in
 aarch64*-*-elf | aarch64*-*-fuchsia* | aarch64*-*-rtems*)
 	tm_file="${tm_file} dbxelf.h elfos.h newlib-stdint.h"
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2aba4c70b44..a5db65f8721 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -660,6 +660,7 @@ Objective-C and Objective-C++ Dialects}.
 @gccoptlist{-fcall-saved-@var{reg}  -fcall-used-@var{reg} @gol
 -ffixed-@var{reg}  -fexceptions @gol
 -fnon-call-exceptions  -fdelete-dead-exceptions  -funwind-tables @gol
+-foff-stack-trampolines @gol
 -fasynchronous-unwind-tables @gol
 -fno-gnu-unique @gol
 -finhibit-size-directive  -fcommon  -fno-ident @gol
@@ -16683,6 +16684,19 @@ instructions.  It does not allow exceptions to be thrown from
 arbitrary signal handlers such as @code{SIGALRM}.  This enables
 @option{-fexceptions}.
 
+@item -foff-stack-trampolines
+@opindex foff-stack-trampolines
+Certain platforms (such as the Apple M1) do not permit an executable
+stack. Generate calls to @code{__builtin_nested_func_ptr_created} and
+@code{__builtin_nested_func_ptr_deleted} in order to allocate and
+deallocate trampoline space on the executable heap. Please note that
+these functions are implemented in libgcc, and will not be compiled in
+unless you provide @option{--enable-off-stack-trampolines} when
+building gcc.  @emph{PLEASE NOTE}: The trampolines are @emph{not}
+guaranteed to be correctly deallocated if you @code{setjmp},
+instantiate nested functions, and then @code{longjmp} back to a state
+prior to having allocated those nested functions.
+
 @item -fdelete-dead-exceptions
 @opindex fdelete-dead-exceptions
 Consider that instructions that may throw exceptions but don't otherwise
diff --git a/gcc/tree-nested.c b/gcc/tree-nested.c
index c7f50ebd21c..a405c905e1d 100644
--- a/gcc/tree-nested.c
+++ b/gcc/tree-nested.c
@@ -611,6 +611,14 @@ get_trampoline_type (struct nesting_info *info)
   if (trampoline_type)
     return trampoline_type;
 
+  /* When trampolines are created off-stack then the only thing we need in the
+     local frame is a single pointer.  */
+  if (flag_off_stack_trampolines)
+    {
+      trampoline_type = build_pointer_type (void_type_node);
+      return trampoline_type;
+    }
+
   align = TRAMPOLINE_ALIGNMENT;
   size = TRAMPOLINE_SIZE;
 
@@ -2784,17 +2792,27 @@ convert_tramp_reference_op (tree *tp, int *walk_subtrees, void *data)
 
       /* Compute the address of the field holding the trampoline.  */
       x = get_frame_field (info, target_context, x, &wi->gsi);
-      x = build_addr (x);
-      x = gsi_gimplify_val (info, x, &wi->gsi);
 
-      /* Do machine-specific ugliness.  Normally this will involve
-	 computing extra alignment, but it can really be anything.  */
-      if (descr)
-	builtin = builtin_decl_implicit (BUILT_IN_ADJUST_DESCRIPTOR);
+      /* APB: We don't need to do the adjustment calls when using off-stack
+	 trampolines, any such adjustment will be done when the off-stack
+	 trampoline is created.  */
+      if (flag_off_stack_trampolines)
+	x = gsi_gimplify_val (info, x, &wi->gsi);
       else
-	builtin = builtin_decl_implicit (BUILT_IN_ADJUST_TRAMPOLINE);
-      call = gimple_build_call (builtin, 1, x);
-      x = init_tmp_var_with_call (info, &wi->gsi, call);
+	{
+	  x = build_addr (x);
+
+	  x = gsi_gimplify_val (info, x, &wi->gsi);
+
+	  /* Do machine-specific ugliness.  Normally this will involve
+	     computing extra alignment, but it can really be anything.  */
+	  if (descr)
+	    builtin = builtin_decl_implicit (BUILT_IN_ADJUST_DESCRIPTOR);
+	  else
+	    builtin = builtin_decl_implicit (BUILT_IN_ADJUST_TRAMPOLINE);
+	  call = gimple_build_call (builtin, 1, x);
+	  x = init_tmp_var_with_call (info, &wi->gsi, call);
+	}
 
       /* Cast back to the proper function type.  */
       x = build1 (NOP_EXPR, TREE_TYPE (t), x);
@@ -3373,6 +3391,7 @@ build_init_call_stmt (struct nesting_info *info, tree decl, tree field,
 static void
 finalize_nesting_tree_1 (struct nesting_info *root)
 {
+  gimple_seq cleanup_list = NULL;
   gimple_seq stmt_list = NULL;
   gimple *stmt;
   tree context = root->context;
@@ -3504,9 +3523,48 @@ finalize_nesting_tree_1 (struct nesting_info *root)
 	  if (!field)
 	    continue;
 
-	  x = builtin_decl_implicit (BUILT_IN_INIT_TRAMPOLINE);
-	  stmt = build_init_call_stmt (root, i->context, field, x);
-	  gimple_seq_add_stmt (&stmt_list, stmt);
+	  if (flag_off_stack_trampolines)
+	    {
+	      /* We pass a whole bunch of arguments to the builtin function that
+		 creates the off-stack trampoline, these are
+		 1. The nested function chain value (that must be passed to the
+		 nested function so it can find the function arguments).
+		 2. A pointer to the nested function implementation,
+		 3. The address in the local stack frame where we should write
+		 the address of the trampoline.
+
+		 When this code was originally written I just kind of threw
+		 everything at the builtin, figuring I'd work out what was
+		 actually needed later, I think, the stack pointer could
+		 certainly be dropped, arguments #2 and #4 are based off the
+		 stack pointer anyway, so #1 doesn't seem to add much value.  */
+	      tree arg1, arg2, arg3;
+
+	      gcc_assert (DECL_STATIC_CHAIN (i->context));
+	      arg1 = build_addr (root->frame_decl);
+	      arg2 = build_addr (i->context);
+
+	      x = build3 (COMPONENT_REF, TREE_TYPE (field),
+			  root->frame_decl, field, NULL_TREE);
+	      arg3 = build_addr (x);
+
+	      x = builtin_decl_implicit (BUILT_IN_NESTED_PTR_CREATED);
+	      stmt = gimple_build_call (x, 3, arg1, arg2, arg3);
+	      gimple_seq_add_stmt (&stmt_list, stmt);
+
+	      /* This call to delete the nested function trampoline is added to
+		 the cleanup list, and called when we exit the current scope.  */
+	      x = builtin_decl_implicit (BUILT_IN_NESTED_PTR_DELETED);
+	      stmt = gimple_build_call (x, 0);
+	      gimple_seq_add_stmt (&cleanup_list, stmt);
+	    }
+	  else
+	    {
+	      /* Original code to initialise the on stack trampoline.  */
+	      x = builtin_decl_implicit (BUILT_IN_INIT_TRAMPOLINE);
+	      stmt = build_init_call_stmt (root, i->context, field, x);
+	      gimple_seq_add_stmt (&stmt_list, stmt);
+	    }
 	}
     }
 
@@ -3531,11 +3589,40 @@ finalize_nesting_tree_1 (struct nesting_info *root)
   /* If we created initialization statements, insert them.  */
   if (stmt_list)
     {
-      gbind *bind;
-      annotate_all_with_location (stmt_list, DECL_SOURCE_LOCATION (context));
-      bind = gimple_seq_first_stmt_as_a_bind (gimple_body (context));
-      gimple_seq_add_seq (&stmt_list, gimple_bind_body (bind));
-      gimple_bind_set_body (bind, stmt_list);
+      if (flag_off_stack_trampolines)
+	{
+	  /* Handle the new, off stack trampolines.  */
+	  gbind *bind;
+	  annotate_all_with_location (stmt_list, DECL_SOURCE_LOCATION (context));
+	  annotate_all_with_location (cleanup_list, DECL_SOURCE_LOCATION (context));
+	  bind = gimple_seq_first_stmt_as_a_bind (gimple_body (context));
+	  gimple_seq_add_seq (&stmt_list, gimple_bind_body (bind));
+
+	  gimple_seq xxx_list = NULL;
+
+	  if (cleanup_list != NULL)
+	    {
+	      /* We maybe shouldn't be creating this try/finally if -fno-exceptions is
+		 in use.  If this is the case, then maybe we should, instead, be
+		 inserting the cleanup code onto every path out of this function?  Not
+		 yet figured out how we would do this.  */
+	      gtry *t = gimple_build_try (stmt_list, cleanup_list, GIMPLE_TRY_FINALLY);
+	      gimple_seq_add_stmt (&xxx_list, t);
+	    }
+	  else
+	    xxx_list = stmt_list;
+
+	  gimple_bind_set_body (bind, xxx_list);
+	}
+      else
+	{
+	  /* The traditional, on stack trampolines.  */
+	  gbind *bind;
+	  annotate_all_with_location (stmt_list, DECL_SOURCE_LOCATION (context));
+	  bind = gimple_seq_first_stmt_as_a_bind (gimple_body (context));
+	  gimple_seq_add_seq (&stmt_list, gimple_bind_body (bind));
+	  gimple_bind_set_body (bind, stmt_list);
+	}
     }
 
   /* If a chain_decl was created, then it needs to be registered with
diff --git a/gcc/tree.c b/gcc/tree.c
index f2c829fa4c6..968963a3127 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -9648,6 +9648,23 @@ build_common_builtin_nodes (void)
 			"__builtin_nonlocal_goto",
 			ECF_NORETURN | ECF_NOTHROW);
 
+  tree ptr_ptr_type_node = build_pointer_type (ptr_type_node);
+
+  ftype = build_function_type_list (void_type_node,
+				    ptr_type_node, // void *chain
+				    ptr_type_node, // void *func
+				    ptr_ptr_type_node, // void **dst
+				    NULL_TREE);
+  local_define_builtin ("__builtin_nested_func_ptr_created", ftype,
+			BUILT_IN_NESTED_PTR_CREATED,
+			"__builtin_nested_func_ptr_created", ECF_NOTHROW);
+
+  ftype = build_function_type_list (void_type_node,
+				    NULL_TREE);
+  local_define_builtin ("__builtin_nested_func_ptr_deleted", ftype,
+			BUILT_IN_NESTED_PTR_DELETED,
+			"__builtin_nested_func_ptr_deleted", ECF_NOTHROW);
+
   ftype = build_function_type_list (void_type_node,
 				    ptr_type_node, ptr_type_node, NULL_TREE);
   local_define_builtin ("__builtin_setjmp_setup", ftype,
diff --git a/libgcc/configure b/libgcc/configure
index 4919a56f518..2f469219e07 100755
--- a/libgcc/configure
+++ b/libgcc/configure
@@ -654,6 +654,7 @@ build
 with_aix_soname
 enable_vtable_verify
 enable_gcov
+off_stack_trampolines
 enable_shared
 libgcc_topdir
 target_alias
@@ -701,6 +702,7 @@ with_target_subdir
 with_cross_host
 with_ld
 enable_shared
+enable_off_stack_trampolines
 enable_gcov
 enable_vtable_verify
 with_aix_soname
@@ -1342,6 +1344,9 @@ Optional Features:
   --disable-FEATURE       do not include FEATURE (same as --enable-FEATURE=no)
   --enable-FEATURE[=ARG]  include FEATURE [ARG=yes]
   --disable-shared        don't provide a shared libgcc
+  --enable-off-stack-trampolines
+                  Specify whether to support generating off-stack trampolines
+
   --disable-gcov          don't provide libgcov and related host tools
   --enable-vtable-verify    Enable vtable verification feature
   --enable-version-specific-runtime-libs    Specify that runtime libraries should be installed in a compiler-specific directory
@@ -2252,6 +2257,27 @@ fi
 
 
 
+# Check whether --enable-off-stack-trampolines was given.
+if test "${enable_off_stack_trampolines+set}" = set; then :
+  enableval=$enable_off_stack_trampolines;
+case "$target" in
+  *)
+    as_fn_error $? "Configure option --enable-off-stack-trampolines is not supported \
+for this platform" "$LINENO" 5
+    off_stack_trampolines=no
+    ;;
+esac
+else
+
+case "$target" in
+  *)
+    off_stack_trampolines=no
+    ;;
+esac
+fi
+
+
+
 # Check whether --enable-gcov was given.
 if test "${enable_gcov+set}" = set; then :
   enableval=$enable_gcov;
diff --git a/libgcc/configure.ac b/libgcc/configure.ac
index 13a80b2551b..97bbd4bd35c 100644
--- a/libgcc/configure.ac
+++ b/libgcc/configure.ac
@@ -68,6 +68,23 @@ AC_ARG_ENABLE(shared,
 ], [enable_shared=yes])
 AC_SUBST(enable_shared)
 
+AC_ARG_ENABLE([off-stack-trampolines],
+  [AS_HELP_STRING([--enable-off-stack-trampolines]
+                  [Specify whether to support generating off-stack trampolines])],[
+case "$target" in
+  *)
+    AC_MSG_ERROR([Configure option --enable-off-stack-trampolines is not supported \
+for this platform])
+    off_stack_trampolines=no
+    ;;
+esac],[
+case "$target" in
+  *)
+    off_stack_trampolines=no
+    ;;
+esac])
+AC_SUBST(off_stack_trampolines)
+
 AC_ARG_ENABLE(gcov,
 [  --disable-gcov          don't provide libgcov and related host tools],
 [], [enable_gcov=yes])
diff --git a/libgcc/libgcc-std.ver.in b/libgcc/libgcc-std.ver.in
index cea33267e53..f26ad3cdf5d 100644
--- a/libgcc/libgcc-std.ver.in
+++ b/libgcc/libgcc-std.ver.in
@@ -1943,4 +1943,7 @@ GCC_4.8.0 {
 GCC_7.0.0 {
   __PFX__divmoddi4
   __PFX__divmodti4
+
+  __builtin_nested_func_ptr_created
+  __builtin_nested_func_ptr_deleted
 }
diff --git a/libgcc/libgcc2.h b/libgcc/libgcc2.h
index 1819ff3ac3d..1a448c02c04 100644
--- a/libgcc/libgcc2.h
+++ b/libgcc/libgcc2.h
@@ -29,6 +29,9 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #pragma GCC visibility push(default)
 #endif
 
+extern void __builtin_nested_func_ptr_created (void *, void *, void **);
+extern void __builtin_nested_func_ptr_deleted (void);
+
 extern int __gcc_bcmp (const unsigned char *, const unsigned char *, size_t);
 extern void __clear_cache (void *, void *);
 extern void __eprintf (const char *, const char *, unsigned int, const char *)
-- 
2.30.1 (Apple Git-130)


