[PATCH] Add -mno-r11 option to suppress load of ppc64 static chain in indirect calls

Michael Meissner meissner@linux.vnet.ibm.com
Wed Jul 6 22:39:00 GMT 2011


This patch adds an option to not load the static chain (r11) for 64-bit PowerPC
calls through function pointers (or virtual functions).  Most of the languages
on the PowerPC do not need the static chain to be loaded for a call, and the
extra load instruction can slow down code that calls very short functions.
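
As a rough illustration (not part of the patch), the AIX-style indirect call
can be modeled in plain C.  All of the names below (func_desc, reg_r2,
reg_r11, loads_performed, sample_callee, call_indirect) are made up for this
sketch and are not GCC internals; the load_static_chain argument stands in for
-mr11 versus -mno-r11.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the 3 word AIX function descriptor.  */
struct func_desc
{
  int (*entry) (void);	/* word 0: address of the function's code */
  void *toc;		/* word 1: TOC pointer the callee expects in r2 */
  void *static_chain;	/* word 2: value for the static chain (r11) */
};

static void *reg_r2;		/* stand-in for the TOC register */
static void *reg_r11;		/* stand-in for the static chain register */
static int loads_performed;	/* descriptor loads done by the last call */

static int
sample_callee (void)
{
  return 41;
}

/* Call through FD.  LOAD_STATIC_CHAIN models -mr11 (nonzero) versus
   -mno-r11 (zero).  */
static int
call_indirect (const struct func_desc *fd, int load_static_chain)
{
  void *saved_toc;
  int (*entry) (void);
  int result;

  assert (fd != NULL);
  loads_performed = 0;

  saved_toc = reg_r2;		/* real code stores r2 to a stack slot */
  entry = fd->entry;		/* load the code address */
  loads_performed++;
  reg_r2 = fd->toc;		/* load the callee's TOC */
  loads_performed++;
  if (load_static_chain)
    {
      reg_r11 = fd->static_chain;	/* the load that -mno-r11 drops */
      loads_performed++;
    }

  result = entry ();
  reg_r2 = saved_toc;		/* reload the caller's TOC after the call */
  return result;
}
```

In this model, a call with load_static_chain set to zero performs two
descriptor loads instead of three and leaves reg_r11 untouched, which is why
such a call cannot reach a nested function that relies on the static chain.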

In addition, if the function does not call alloca or setjmp, and does not deal
with exceptions where the stack is modified, the compiler can store the TOC
value for the current function once in the function prologue, rather than
storing it at each call site.
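
The hoisting decision can be sketched as a small predicate.  This is only a
model of the test made in rs6000_call_indirect_aix in the patch below: the
struct func_props type and the can_save_toc_in_prologue name are hypothetical,
and the fields merely mirror the cfun flags that the patch checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the per-function flags the patch looks at.  */
struct func_props
{
  bool calls_alloca;
  bool calls_setjmp;
  bool has_nonlocal_label;
  bool can_throw_non_call_exceptions;
  bool nothrow;			/* ECF_NOTHROW is set for the function */
};

/* Return true if the TOC save can be done once in the prologue.
   SAVE_TOC_INDIRECT models the -msave-toc-indirect debug switch.  */
static bool
can_save_toc_in_prologue (const struct func_props *f, bool save_toc_indirect)
{
  assert (f != NULL);
  return (save_toc_indirect
	  && !f->calls_alloca
	  && !f->calls_setjmp
	  && !f->has_nonlocal_label
	  && !f->can_throw_non_call_exceptions
	  && f->nothrow);
}
```

If any one of the conditions fails, the save stays at each call site, which
matches the else branch in the patch that emits the volatile store of r2.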

The effect of these patches is to speed up 464.h264ref in the SPEC 2006
benchmark by about 7% if -mno-r11 is used, and 5% if it is not used (but the
save of the TOC register is hoisted).  I believe this is because the load of
the current function's TOC (r2) has to wait until the store queue is drained
of the store issued just before the call.

Unfortunately, I do see a 3% slowdown in 429.mcf, and I don't yet know the
cause.

I have bootstrapped the compiler and saw no regressions in make check.  Is it
ok to install in the trunk?

[gcc]
2011-07-06  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000-protos.h (rs6000_call_indirect_aix): New
	declaration.
	(rs6000_save_toc_in_prologue_p): Ditto.

	* config/rs6000/rs6000.opt (-mr11): New switch to disable loading
	up the static chain (r11) during indirect function calls.
	(-msave-toc-indirect): New undocumented debug switch.

	* config/rs6000/rs6000.c (struct machine_function): Add
	save_toc_in_prologue field to note whether the prologue needs to
	save the TOC value in the reserved stack location.
	(rs6000_emit_prologue): Use TOC_REGNUM instead of 2.  If we need
	to save the TOC in the prologue, do so.
	(rs6000_trampoline_init): Don't allow creating AIX style
	trampolines if -mno-r11 is in effect.
	(rs6000_call_indirect_aix): New function to create AIX style
	indirect calls, adding support for -mno-r11 to suppress loading
	the static chain, and saving the TOC in the prologue instead of
	the call body.
	(rs6000_save_toc_in_prologue_p): Return true if we are saving the
	TOC in the prologue.

	* config/rs6000/rs6000.md (STACK_POINTER_REGNUM): Add more fixed
	register numbers.
	(TOC_REGNUM): Ditto.
	(STATIC_CHAIN_REGNUM): Ditto.
	(ARG_POINTER_REGNUM): Ditto.
	(SFP_REGNO): Delete, unused.
	(TOC_SAVE_OFFSET_32BIT): Add constants for AIX TOC save and
	function descriptor offsets.
	(TOC_SAVE_OFFSET_64BIT): Ditto.
	(AIX_FUNC_DESC_TOC_32BIT): Ditto.
	(AIX_FUNC_DESC_TOC_64BIT): Ditto.
	(AIX_FUNC_DESC_SC_32BIT): Ditto.
	(AIX_FUNC_DESC_SC_64BIT): Ditto.
	(ptrload): New mode attribute for the appropriate load of a
	pointer.
	(call_indirect_aix32): Delete, rewrite AIX indirect function
	calls.
	(call_indirect_aix64): Ditto.
	(call_value_indirect_aix32): Ditto.
	(call_value_indirect_aix64): Ditto.
	(call_indirect_nonlocal_aix32_internal): Ditto.
	(call_indirect_nonlocal_aix32): Ditto.
	(call_indirect_nonlocal_aix64_internal): Ditto.
	(call_indirect_nonlocal_aix64): Ditto.
	(call): Rewrite AIX indirect function calls.  Add support for
	eliminating the static chain, and for moving the save of the TOC
	to the function prologue.
	(call_value): Ditto.
	(call_indirect_aix<ptrsize>): Ditto.
	(call_indirect_aix<ptrsize>_internal): Ditto.
	(call_indirect_aix<ptrsize>_internal2): Ditto.
	(call_indirect_aix<ptrsize>_nor11): Ditto.
	(call_value_indirect_aix<ptrsize>): Ditto.
	(call_value_indirect_aix<ptrsize>_internal): Ditto.
	(call_value_indirect_aix<ptrsize>_internal2): Ditto.
	(call_value_indirect_aix<ptrsize>_nor11): Ditto.
	(call_nonlocal_aix32): Relocate in the rs6000.md file.
	(call_nonlocal_aix64): Ditto.

	* doc/invoke.texi (RS/6000 and PowerPC Options): Add -mr11 and
	-mno-r11 documentation.
[gcc/testsuite]
2011-07-06  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/no-r11-1.c: New test for -mr11, -mno-r11.
	* gcc.target/powerpc/no-r11-2.c: Ditto.
	* gcc.target/powerpc/no-r11-3.c: Ditto.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meissner@linux.vnet.ibm.com	fax +1 (978) 399-6899
-------------- next part --------------
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 175921)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -171,6 +171,8 @@ extern unsigned int rs6000_dbx_register_
 extern void rs6000_emit_epilogue (int);
 extern void rs6000_emit_eh_reg_restore (rtx, rtx);
 extern const char * output_isel (rtx *);
+extern void rs6000_call_indirect_aix (rtx, rtx, rtx);
+extern bool rs6000_save_toc_in_prologue_p (void);
 
 extern void rs6000_aix_asm_output_dwarf_table_ref (char *);
 
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt	(revision 175921)
+++ gcc/config/rs6000/rs6000.opt	(working copy)
@@ -521,4 +521,10 @@ mxilinx-fpu
 Target Var(rs6000_xilinx_fpu) Save
 Specify Xilinx FPU.
 
+mr11
+Target Report Var(TARGET_R11) Init(1) Save
+Use/do not use r11 to hold the static link in calls.
 
+msave-toc-indirect
+Target Undocumented Var(TARGET_SAVE_TOC_INDIRECT) Save Init(1)
+; Control whether we save the TOC in the prologue for indirect calls or generate the save inline
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 175921)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -130,6 +130,9 @@ typedef struct GTY(()) machine_function
   int ra_need_lr;
   /* Cache lr_save_p after expansion of builtin_eh_return.  */
   int lr_save_state;
+  /* Whether we need to save the TOC to the reserved stack location in the
+     function prologue.  */
+  bool save_toc_in_prologue;
   /* Offset from virtual_stack_vars_rtx to the start of the ABI_V4
      varargs save area.  */
   HOST_WIDE_INT varargs_save_offset;
@@ -20325,7 +20328,7 @@ rs6000_emit_prologue (void)
       JUMP_LABEL (jump) = toc_save_done;
       LABEL_NUSES (toc_save_done) += 1;
 
-      emit_frame_save (frame_reg_rtx, frame_ptr_rtx, reg_mode, 2,
+      emit_frame_save (frame_reg_rtx, frame_ptr_rtx, reg_mode, TOC_REGNUM,
 		       sp_offset + 5 * reg_size, info->total_size);
       emit_label (toc_save_done);
       if (using_static_chain_p)
@@ -20516,6 +20519,11 @@ rs6000_emit_prologue (void)
 	emit_move_insn (lr, gen_rtx_REG (Pmode, 0));
     }
 #endif
+
+  /* If we need to, save the TOC register after doing the stack setup.  */
+  if (rs6000_save_toc_in_prologue_p ())
+    emit_frame_save (sp_reg_rtx, sp_reg_rtx, reg_mode, TOC_REGNUM,
+		     5 * reg_size, info->total_size);
 }
 
 /* Write function prologue.  */
@@ -24469,9 +24477,14 @@ rs6000_trampoline_init (rtx m_tramp, tre
     /* Under AIX, just build the 3 word function descriptor */
     case ABI_AIX:
       {
-	rtx fnmem = gen_const_mem (Pmode, force_reg (Pmode, fnaddr));
-	rtx fn_reg = gen_reg_rtx (Pmode);
-	rtx toc_reg = gen_reg_rtx (Pmode);
+	rtx fnmem, fn_reg, toc_reg;
+
+	if (!TARGET_R11)
+	  error ("-mno-r11 must not be used if you have trampolines");
+
+	fnmem = gen_const_mem (Pmode, force_reg (Pmode, fnaddr));
+	fn_reg = gen_reg_rtx (Pmode);
+	toc_reg = gen_reg_rtx (Pmode);
 
   /* Macro to shorten the code expansions below.  */
 # define MEM_PLUS(MEM, OFFSET) adjust_address (MEM, Pmode, OFFSET)
@@ -27760,4 +27773,132 @@ rs6000_legitimate_constant_p (enum machi
 	  || easy_vector_constant (x, mode));
 }
 
+
+/* A function pointer under AIX is a pointer to a data area whose first word
+   contains the actual address of the function, whose second word contains a
+   pointer to its TOC, and whose third word contains a value to place in the
+   static chain register (r11).  Note that if we load the static chain, our
+   "trampoline" need not have any executable code.  */
+
+void
+rs6000_call_indirect_aix (rtx value, rtx func_desc, rtx flag)
+{
+  rtx func_addr;
+  rtx toc_reg;
+  rtx sc_reg;
+  rtx stack_ptr;
+  rtx stack_toc_offset;
+  rtx stack_toc_mem;
+  rtx func_toc_offset;
+  rtx func_toc_mem;
+  rtx func_sc_offset;
+  rtx func_sc_mem;
+  rtx insn;
+  rtx (*call_func) (rtx, rtx, rtx, rtx);
+  rtx (*call_value_func) (rtx, rtx, rtx, rtx, rtx);
+
+  stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
+  toc_reg = gen_rtx_REG (Pmode, TOC_REGNUM);
+
+  /* Load up address of the actual function.  */
+  func_desc = force_reg (Pmode, func_desc);
+  func_addr = gen_reg_rtx (Pmode);
+  emit_move_insn (func_addr, gen_rtx_MEM (Pmode, func_desc));
+
+  if (TARGET_32BIT)
+    {
+
+      stack_toc_offset = GEN_INT (TOC_SAVE_OFFSET_32BIT);
+      func_toc_offset = GEN_INT (AIX_FUNC_DESC_TOC_32BIT);
+      func_sc_offset = GEN_INT (AIX_FUNC_DESC_SC_32BIT);
+      if (TARGET_R11)
+	{
+	  call_func = gen_call_indirect_aix32bit;
+	  call_value_func = gen_call_value_indirect_aix32bit;
+	}
+      else
+	{
+	  call_func = gen_call_indirect_aix32bit_nor11;
+	  call_value_func = gen_call_value_indirect_aix32bit_nor11;
+	}
+    }
+  else
+    {
+      stack_toc_offset = GEN_INT (TOC_SAVE_OFFSET_64BIT);
+      func_toc_offset = GEN_INT (AIX_FUNC_DESC_TOC_64BIT);
+      func_sc_offset = GEN_INT (AIX_FUNC_DESC_SC_64BIT);
+      if (TARGET_R11)
+	{
+	  call_func = gen_call_indirect_aix64bit;
+	  call_value_func = gen_call_value_indirect_aix64bit;
+	}
+      else
+	{
+	  call_func = gen_call_indirect_aix64bit_nor11;
+	  call_value_func = gen_call_value_indirect_aix64bit_nor11;
+	}
+    }
+
+  /* Reserved spot to store the TOC.  */
+  stack_toc_mem = gen_frame_mem (Pmode,
+				 gen_rtx_PLUS (Pmode,
+					       stack_ptr,
+					       stack_toc_offset));
+
+  gcc_assert (cfun);
+  gcc_assert (cfun->machine);
+
+  /* Can we optimize saving the TOC in the prologue or do we need to do it at
+     every call?  */
+  if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca
+      && !cfun->calls_setjmp && !cfun->has_nonlocal_label
+      && !cfun->can_throw_non_call_exceptions
+      && ((flags_from_decl_or_type (cfun->decl) & ECF_NOTHROW) == ECF_NOTHROW))
+    cfun->machine->save_toc_in_prologue = true;
+
+  else
+    {
+      MEM_VOLATILE_P (stack_toc_mem) = 1;
+      emit_move_insn (stack_toc_mem, toc_reg);
+    }
+
+  /* Calculate the address to load the TOC of the called function.  We don't
+     actually load this until the split after reload.  */
+  func_toc_mem = gen_rtx_MEM (Pmode,
+			      gen_rtx_PLUS (Pmode,
+					    func_desc,
+					    func_toc_offset));
+
+  /* If we have a static chain, load it up.  */
+  if (TARGET_R11)
+    {
+      func_sc_mem = gen_rtx_MEM (Pmode,
+				 gen_rtx_PLUS (Pmode,
+					       func_desc,
+					       func_sc_offset));
+
+      sc_reg = gen_rtx_REG (Pmode, STATIC_CHAIN_REGNUM);
+      emit_move_insn (sc_reg, func_sc_mem);
+    }
+
+  /* Create the call.  */
+  if (value)
+    insn = call_value_func (value, func_addr, flag, func_toc_mem,
+			    stack_toc_mem);
+  else
+    insn = call_func (func_addr, flag, func_toc_mem, stack_toc_mem);
+
+  emit_call_insn (insn);
+  return;
+}
+
+/* Return whether we need to always update the saved TOC pointer when we update
+   the stack pointer.  */
+
+bool
+rs6000_save_toc_in_prologue_p (void)
+{
+  return (cfun && cfun->machine && cfun->machine->save_toc_in_prologue);
+}
+
 #include "gt-rs6000.h"
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 175921)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -27,9 +27,14 @@
 ;;
 
 (define_constants
-  [(MQ_REGNO			64)
+  [(STACK_POINTER_REGNUM	1)
+   (TOC_REGNUM			2)
+   (STATIC_CHAIN_REGNUM		11)
+   (HARD_FRAME_POINTER_REGNUM	31)
+   (MQ_REGNO			64)
    (LR_REGNO			65)
    (CTR_REGNO			66)
+   (ARG_POINTER_REGNUM		67)
    (CR0_REGNO			68)
    (CR1_REGNO			69)
    (CR2_REGNO			70)
@@ -46,7 +51,19 @@ (define_constants
    (VSCR_REGNO			110)
    (SPE_ACC_REGNO		111)
    (SPEFSCR_REGNO		112)
-   (SFP_REGNO			113)
+   (FRAME_POINTER_REGNUM	113)
+
+   ; ABI defined stack offsets for storing the TOC pointer with AIX calls.
+   (TOC_SAVE_OFFSET_32BIT	20)
+   (TOC_SAVE_OFFSET_64BIT	40)
+
+   ; Function TOC offset in the AIX function descriptor.
+   (AIX_FUNC_DESC_TOC_32BIT	4)
+   (AIX_FUNC_DESC_TOC_64BIT	8)
+
+   ; Static chain offset in the AIX function descriptor.
+   (AIX_FUNC_DESC_SC_32BIT	8)
+   (AIX_FUNC_DESC_SC_64BIT	16)
   ])
 
 ;;
@@ -267,6 +284,9 @@ (define_mode_attr tptrsize [(SI "TARGET_
 (define_mode_attr mptrsize [(SI "si")
 			    (DI "di")])
 
+(define_mode_attr ptrload [(SI "{l|lwz}")
+			   (DI "ld")])
+
 (define_mode_attr rreg [(SF   "f")
 			(DF   "ws")
 			(V4SF "wf")
@@ -12178,87 +12198,7 @@ (define_insn "largetoc_low"
    "TARGET_ELF && TARGET_CMODEL != CMODEL_SMALL"
    "{cal %0,%2@l(%1)|addi %0,%1,%2@l}")
 
-;; A function pointer under AIX is a pointer to a data area whose first word
-;; contains the actual address of the function, whose second word contains a
-;; pointer to its TOC, and whose third word contains a value to place in the
-;; static chain register (r11).  Note that if we load the static chain, our
-;; "trampoline" need not have any executable code.
-
-(define_expand "call_indirect_aix32"
-  [(set (match_dup 2)
-	(mem:SI (match_operand:SI 0 "gpc_reg_operand" "")))
-   (set (mem:SI (plus:SI (reg:SI 1) (const_int 20)))
-	(reg:SI 2))
-   (set (reg:SI 11)
-	(mem:SI (plus:SI (match_dup 0)
-			 (const_int 8))))
-   (parallel [(call (mem:SI (match_dup 2))
-		    (match_operand 1 "" ""))
-	      (use (mem:SI (plus:SI (match_dup 0) (const_int 4))))
-	      (use (reg:SI 11))
-	      (use (mem:SI (plus:SI (reg:SI 1) (const_int 20))))
-	      (clobber (reg:SI LR_REGNO))])]
-  "TARGET_32BIT"
-  "
-{ operands[2] = gen_reg_rtx (SImode); }")
-
-(define_expand "call_indirect_aix64"
-  [(set (match_dup 2)
-	(mem:DI (match_operand:DI 0 "gpc_reg_operand" "")))
-   (set (mem:DI (plus:DI (reg:DI 1) (const_int 40)))
-	(reg:DI 2))
-   (set (reg:DI 11)
-	(mem:DI (plus:DI (match_dup 0)
-			 (const_int 16))))
-   (parallel [(call (mem:SI (match_dup 2))
-		    (match_operand 1 "" ""))
-	      (use (mem:DI (plus:DI (match_dup 0) (const_int 8))))
-	      (use (reg:DI 11))
-	      (use (mem:DI (plus:DI (reg:DI 1) (const_int 40))))
-	      (clobber (reg:SI LR_REGNO))])]
-  "TARGET_64BIT"
-  "
-{ operands[2] = gen_reg_rtx (DImode); }")
-
-(define_expand "call_value_indirect_aix32"
-  [(set (match_dup 3)
-	(mem:SI (match_operand:SI 1 "gpc_reg_operand" "")))
-   (set (mem:SI (plus:SI (reg:SI 1) (const_int 20)))
-	(reg:SI 2))
-   (set (reg:SI 11)
-	(mem:SI (plus:SI (match_dup 1)
-			 (const_int 8))))
-   (parallel [(set (match_operand 0 "" "")
-		   (call (mem:SI (match_dup 3))
-			 (match_operand 2 "" "")))
-	      (use (mem:SI (plus:SI (match_dup 1) (const_int 4))))
-	      (use (reg:SI 11))
-	      (use (mem:SI (plus:SI (reg:SI 1) (const_int 20))))
-	      (clobber (reg:SI LR_REGNO))])]
-  "TARGET_32BIT"
-  "
-{ operands[3] = gen_reg_rtx (SImode); }")
-
-(define_expand "call_value_indirect_aix64"
-  [(set (match_dup 3)
-	(mem:DI (match_operand:DI 1 "gpc_reg_operand" "")))
-   (set (mem:DI (plus:DI (reg:DI 1) (const_int 40)))
-	(reg:DI 2))
-   (set (reg:DI 11)
-	(mem:DI (plus:DI (match_dup 1)
-			 (const_int 16))))
-   (parallel [(set (match_operand 0 "" "")
-		   (call (mem:SI (match_dup 3))
-			 (match_operand 2 "" "")))
-	      (use (mem:DI (plus:DI (match_dup 1) (const_int 8))))
-	      (use (reg:DI 11))
-	      (use (mem:DI (plus:DI (reg:DI 1) (const_int 40))))
-	      (clobber (reg:SI LR_REGNO))])]
-  "TARGET_64BIT"
-  "
-{ operands[3] = gen_reg_rtx (DImode); }")
-
-;; Now the definitions for the call and call_value insns
+;; Call and call_value insns
 (define_expand "call"
   [(parallel [(call (mem:SI (match_operand 0 "address_operand" ""))
 		    (match_operand 1 "" ""))
@@ -12294,13 +12234,7 @@ (define_expand "call"
 	case ABI_AIX:
 	  /* AIX function pointers are really pointers to a three word
 	     area.  */
-	  emit_call_insn (TARGET_32BIT
-			  ? gen_call_indirect_aix32 (force_reg (SImode,
-							        operands[0]),
-						     operands[1])
-			  : gen_call_indirect_aix64 (force_reg (DImode,
-							        operands[0]),
-						     operands[1]));
+	  rs6000_call_indirect_aix (NULL_RTX, operands[0], operands[1]);
 	  DONE;
 
 	default:
@@ -12345,15 +12279,7 @@ (define_expand "call_value"
 	case ABI_AIX:
 	  /* AIX function pointers are really pointers to a three word
 	     area.  */
-	  emit_call_insn (TARGET_32BIT
-			  ? gen_call_value_indirect_aix32 (operands[0],
-							   force_reg (SImode,
-								      operands[1]),
-							   operands[2])
-			  : gen_call_value_indirect_aix64 (operands[0],
-							   force_reg (DImode,
-								      operands[1]),
-							   operands[2]));
+	  rs6000_call_indirect_aix (operands[0], operands[1], operands[2]);
 	  DONE;
 
 	default:
@@ -12447,149 +12373,202 @@ (define_insn "*call_value_local64"
   [(set_attr "type" "branch")
    (set_attr "length" "4,8")])
 
-;; Call to function which may be in another module.  Restore the TOC
-;; pointer (r2) after the call unless this is System V.
-;; Operand2 is nonzero if we are using the V.4 calling sequence and
-;; either the function was not prototyped, or it was prototyped as a
-;; variable argument function.  It is > 0 if FP registers were passed
-;; and < 0 if they were not.
+;; Call to indirect functions with the AIX abi using a 3 word descriptor.
+;; Operand0 is the address of the function to call
+;; Operand1 is the flag for System V.4 for unprototyped or FP registers
+;; Operand2 is the location in the function descriptor to load r2 from
+;; Operand3 is the stack location to hold the current TOC pointer
 
-(define_insn_and_split "*call_indirect_nonlocal_aix32_internal"
-  [(call (mem:SI (match_operand:SI 0 "register_operand" "c,*l"))
-		 (match_operand 1 "" "g,g"))
-   (use (mem:SI (plus:SI (match_operand:SI 2 "register_operand" "b,b") (const_int 4))))
-   (use (reg:SI 11))
-   (use (mem:SI (plus:SI (reg:SI 1) (const_int 20))))
-   (clobber (reg:SI LR_REGNO))]
-  "TARGET_32BIT && DEFAULT_ABI == ABI_AIX"
+(define_insn_and_split "call_indirect_aix<ptrsize>"
+  [(call (mem:SI (match_operand:P 0 "register_operand" "c,*l"))
+	 (match_operand 1 "" "g,g"))
+   (use (match_operand:P 2 "memory_operand" "m,m"))
+   (use (match_operand:P 3 "memory_operand" "m,m"))
+   (use (reg:P STATIC_CHAIN_REGNUM))
+   (clobber (reg:P LR_REGNO))]
+  "DEFAULT_ABI == ABI_AIX && TARGET_R11"
   "#"
   "&& reload_completed"
-  [(set (reg:SI 2)
-	(mem:SI (plus:SI (match_dup 2) (const_int 4))))
+  [(set (reg:P TOC_REGNUM) (match_dup 2))
    (parallel [(call (mem:SI (match_dup 0))
 		    (match_dup 1))
-	      (use (reg:SI 2))
-	      (use (reg:SI 11))
-	      (set (reg:SI 2)
-		   (mem:SI (plus:SI (reg:SI 1) (const_int 20))))
-	      (clobber (reg:SI LR_REGNO))])]
+	      (use (reg:P TOC_REGNUM))
+	      (use (reg:P STATIC_CHAIN_REGNUM))
+	      (use (match_dup 3))
+	      (set (reg:P TOC_REGNUM) (match_dup 3))
+	      (clobber (reg:P LR_REGNO))])]
   ""
   [(set_attr "type" "jmpreg")
    (set_attr "length" "12")])
 
-(define_insn "*call_indirect_nonlocal_aix32"
-  [(call (mem:SI (match_operand:SI 0 "register_operand" "c,*l"))
+(define_insn "*call_indirect_aix<ptrsize>_internal"
+  [(call (mem:SI (match_operand:P 0 "register_operand" "c,*l"))
 	 (match_operand 1 "" "g,g"))
-   (use (reg:SI 2))
-   (use (reg:SI 11))
-   (set (reg:SI 2)
-	(mem:SI (plus:SI (reg:SI 1) (const_int 20))))
-   (clobber (reg:SI LR_REGNO))]
-  "TARGET_32BIT && DEFAULT_ABI == ABI_AIX && reload_completed"
-  "b%T0l\;{l|lwz} 2,20(1)"
+   (use (reg:P TOC_REGNUM))
+   (use (reg:P STATIC_CHAIN_REGNUM))
+   (use (match_operand:P 2 "memory_operand" "m,m"))
+   (set (reg:P TOC_REGNUM) (match_dup 2))
+   (clobber (reg:P LR_REGNO))]
+  "DEFAULT_ABI == ABI_AIX && reload_completed && TARGET_R11"
+  "b%T0l\;<ptrload> 2,%2"
   [(set_attr "type" "jmpreg")
    (set_attr "length" "8")])
 
-(define_insn "*call_nonlocal_aix32"
-  [(call (mem:SI (match_operand:SI 0 "symbol_ref_operand" "s"))
-	 (match_operand 1 "" "g"))
-   (use (match_operand:SI 2 "immediate_operand" "O"))
-   (clobber (reg:SI LR_REGNO))]
-  "TARGET_32BIT
-   && DEFAULT_ABI == ABI_AIX
-   && (INTVAL (operands[2]) & CALL_LONG) == 0"
-  "bl %z0\;%."
-  [(set_attr "type" "branch")
-   (set_attr "length" "8")])
-   
-(define_insn_and_split "*call_indirect_nonlocal_aix64_internal"
-  [(call (mem:SI (match_operand:DI 0 "register_operand" "c,*l"))
-		 (match_operand 1 "" "g,g"))
-   (use (mem:DI (plus:DI (match_operand:DI 2 "register_operand" "b,b")
-			 (const_int 8))))
-   (use (reg:DI 11))
-   (use (mem:DI (plus:DI (reg:DI 1) (const_int 40))))
-   (clobber (reg:SI LR_REGNO))]
-  "TARGET_64BIT && DEFAULT_ABI == ABI_AIX"
+;; Like call_indirect_aix<ptrsize>, except don't load the static chain
+;; Operand0 is the address of the function to call
+;; Operand1 is the flag for System V.4 for unprototyped or FP registers
+;; Operand2 is the location in the function descriptor to load r2 from
+;; Operand3 is the stack location to hold the current TOC pointer
+
+(define_insn_and_split "call_indirect_aix<ptrsize>_nor11"
+  [(call (mem:SI (match_operand:P 0 "register_operand" "c,*l"))
+	 (match_operand 1 "" "g,g"))
+   (use (match_operand:P 2 "memory_operand" "m,m"))
+   (use (match_operand:P 3 "memory_operand" "m,m"))
+   (clobber (reg:P LR_REGNO))]
+  "DEFAULT_ABI == ABI_AIX && !TARGET_R11"
   "#"
   "&& reload_completed"
-  [(set (reg:DI 2)
-	(mem:DI (plus:DI (match_dup 2) (const_int 8))))
+  [(set (reg:P TOC_REGNUM) (match_dup 2))
    (parallel [(call (mem:SI (match_dup 0))
 		    (match_dup 1))
-	      (use (reg:DI 2))
-	      (use (reg:DI 11))
-	      (set (reg:DI 2)
-		   (mem:DI (plus:DI (reg:DI 1) (const_int 40))))
-	      (clobber (reg:SI LR_REGNO))])]
+	      (use (reg:P TOC_REGNUM))
+	      (use (match_dup 3))
+	      (set (reg:P TOC_REGNUM) (match_dup 3))
+	      (clobber (reg:P LR_REGNO))])]
   ""
   [(set_attr "type" "jmpreg")
    (set_attr "length" "12")])
 
-(define_insn "*call_indirect_nonlocal_aix64"
-  [(call (mem:SI (match_operand:DI 0 "register_operand" "c,*l"))
+(define_insn "*call_indirect_aix<ptrsize>_internal2"
+  [(call (mem:SI (match_operand:P 0 "register_operand" "c,*l"))
 	 (match_operand 1 "" "g,g"))
-   (use (reg:DI 2))
-   (use (reg:DI 11))
-   (set (reg:DI 2)
-	(mem:DI (plus:DI (reg:DI 1) (const_int 40))))
-   (clobber (reg:SI LR_REGNO))]
-  "TARGET_64BIT && DEFAULT_ABI == ABI_AIX && reload_completed"
-  "b%T0l\;ld 2,40(1)"
+   (use (reg:P TOC_REGNUM))
+   (use (match_operand:P 2 "memory_operand" "m,m"))
+   (set (reg:P TOC_REGNUM) (match_dup 2))
+   (clobber (reg:P LR_REGNO))]
+  "DEFAULT_ABI == ABI_AIX && reload_completed && !TARGET_R11"
+  "b%T0l\;<ptrload> 2,%2"
   [(set_attr "type" "jmpreg")
    (set_attr "length" "8")])
 
-(define_insn "*call_nonlocal_aix64"
-  [(call (mem:SI (match_operand:DI 0 "symbol_ref_operand" "s"))
-	 (match_operand 1 "" "g"))
-   (use (match_operand:SI 2 "immediate_operand" "O"))
-   (clobber (reg:SI LR_REGNO))]
-  "TARGET_64BIT
-   && DEFAULT_ABI == ABI_AIX
-   && (INTVAL (operands[2]) & CALL_LONG) == 0"
-  "bl %z0\;%."
-  [(set_attr "type" "branch")
+;; Operand0 is the return result of the function
+;; Operand1 is the address of the function to call
+;; Operand2 is the flag for System V.4 for unprototyped or FP registers
+;; Operand3 is the location in the function descriptor to load r2 from
+;; Operand4 is the stack location to hold the current TOC pointer
+
+(define_insn_and_split "call_value_indirect_aix<ptrsize>"
+  [(set (match_operand 0 "" "")
+	(call (mem:SI (match_operand:P 1 "register_operand" "c,*l"))
+	      (match_operand 2 "" "g,g")))
+   (use (match_operand:P 3 "memory_operand" "m,m"))
+   (use (match_operand:P 4 "memory_operand" "m,m"))
+   (use (reg:P STATIC_CHAIN_REGNUM))
+   (clobber (reg:P LR_REGNO))]
+  "DEFAULT_ABI == ABI_AIX && TARGET_R11"
+  "#"
+  "&& reload_completed"
+  [(set (reg:P TOC_REGNUM) (match_dup 3))
+   (parallel [(set (match_dup 0)
+		   (call (mem:SI (match_dup 1))
+			 (match_dup 2)))
+	      (use (reg:P TOC_REGNUM))
+	      (use (reg:P STATIC_CHAIN_REGNUM))
+	      (use (match_dup 4))
+	      (set (reg:P TOC_REGNUM) (match_dup 4))
+	      (clobber (reg:P LR_REGNO))])]
+  ""
+  [(set_attr "type" "jmpreg")
+   (set_attr "length" "12")])
+
+(define_insn "*call_value_indirect_aix<ptrsize>_internal"
+  [(set (match_operand 0 "" "")
+	(call (mem:SI (match_operand:P 1 "register_operand" "c,*l"))
+	      (match_operand 2 "" "g,g")))
+   (use (reg:P TOC_REGNUM))
+   (use (reg:P STATIC_CHAIN_REGNUM))
+   (use (match_operand:P 3 "memory_operand" "m,m"))
+   (set (reg:P TOC_REGNUM) (match_dup 3))
+   (clobber (reg:P LR_REGNO))]
+  "DEFAULT_ABI == ABI_AIX && reload_completed && TARGET_R11"
+  "b%T1l\;<ptrload> 2,%3"
+  [(set_attr "type" "jmpreg")
    (set_attr "length" "8")])
 
-(define_insn_and_split "*call_value_indirect_nonlocal_aix32_internal"
+;; Like call_value_indirect_aix<ptrsize>, but don't load the static chain
+;; Operand0 is the return result of the function
+;; Operand1 is the address of the function to call
+;; Operand2 is the flag for System V.4 for unprototyped or FP registers
+;; Operand3 is the location in the function descriptor to load r2 from
+;; Operand4 is the stack location to hold the current TOC pointer
+
+(define_insn_and_split "call_value_indirect_aix<ptrsize>_nor11"
   [(set (match_operand 0 "" "")
-	(call (mem:SI (match_operand:SI 1 "register_operand" "c,*l"))
-		      (match_operand 2 "" "g,g")))
-	(use (mem:SI (plus:SI (match_operand:SI 3 "register_operand" "b,b")
-			      (const_int 4))))
-	(use (reg:SI 11))
-	(use (mem:SI (plus:SI (reg:SI 1) (const_int 20))))
-	(clobber (reg:SI LR_REGNO))]
-  "TARGET_32BIT && DEFAULT_ABI == ABI_AIX"
+	(call (mem:SI (match_operand:P 1 "register_operand" "c,*l"))
+	      (match_operand 2 "" "g,g")))
+   (use (match_operand:P 3 "memory_operand" "m,m"))
+   (use (match_operand:P 4 "memory_operand" "m,m"))
+   (clobber (reg:P LR_REGNO))]
+  "DEFAULT_ABI == ABI_AIX && !TARGET_R11"
   "#"
   "&& reload_completed"
-  [(set (reg:SI 2)
-	(mem:SI (plus:SI (match_dup 3) (const_int 4))))
-   (parallel [(set (match_dup 0) (call (mem:SI (match_dup 1))
-				       (match_dup 2)))
-	      (use (reg:SI 2))
-	      (use (reg:SI 11))
-	      (set (reg:SI 2)
-		   (mem:SI (plus:SI (reg:SI 1) (const_int 20))))
-	      (clobber (reg:SI LR_REGNO))])]
+  [(set (reg:P TOC_REGNUM) (match_dup 3))
+   (parallel [(set (match_dup 0)
+		   (call (mem:SI (match_dup 1))
+			 (match_dup 2)))
+	      (use (reg:P TOC_REGNUM))
+	      (use (match_dup 4))
+	      (set (reg:P TOC_REGNUM) (match_dup 4))
+	      (clobber (reg:P LR_REGNO))])]
   ""
   [(set_attr "type" "jmpreg")
    (set_attr "length" "12")])
 
-(define_insn "*call_value_indirect_nonlocal_aix32"
+(define_insn "*call_value_indirect_aix<ptrsize>_internal2"
   [(set (match_operand 0 "" "")
-	(call (mem:SI (match_operand:SI 1 "register_operand" "c,*l"))
+	(call (mem:SI (match_operand:P 1 "register_operand" "c,*l"))
 	      (match_operand 2 "" "g,g")))
-   (use (reg:SI 2))
-   (use (reg:SI 11))
-   (set (reg:SI 2)
-	(mem:SI (plus:SI (reg:SI 1) (const_int 20))))
-   (clobber (reg:SI LR_REGNO))]
-  "TARGET_32BIT && DEFAULT_ABI == ABI_AIX && reload_completed"
-  "b%T1l\;{l|lwz} 2,20(1)"
+   (use (reg:P TOC_REGNUM))
+   (use (match_operand:P 3 "memory_operand" "m,m"))
+   (set (reg:P TOC_REGNUM) (match_dup 3))
+   (clobber (reg:P LR_REGNO))]
+  "DEFAULT_ABI == ABI_AIX && reload_completed && !TARGET_R11"
+  "b%T1l\;<ptrload> 2,%3"
   [(set_attr "type" "jmpreg")
    (set_attr "length" "8")])
 
+;; Call to function which may be in another module.  Restore the TOC
+;; pointer (r2) after the call unless this is System V.
+;; Operand2 is nonzero if we are using the V.4 calling sequence and
+;; either the function was not prototyped, or it was prototyped as a
+;; variable argument function.  It is > 0 if FP registers were passed
+;; and < 0 if they were not.
+
+(define_insn "*call_nonlocal_aix32"
+  [(call (mem:SI (match_operand:SI 0 "symbol_ref_operand" "s"))
+	 (match_operand 1 "" "g"))
+   (use (match_operand:SI 2 "immediate_operand" "O"))
+   (clobber (reg:SI LR_REGNO))]
+  "TARGET_32BIT
+   && DEFAULT_ABI == ABI_AIX
+   && (INTVAL (operands[2]) & CALL_LONG) == 0"
+  "bl %z0\;%."
+  [(set_attr "type" "branch")
+   (set_attr "length" "8")])
+   
+(define_insn "*call_nonlocal_aix64"
+  [(call (mem:SI (match_operand:DI 0 "symbol_ref_operand" "s"))
+	 (match_operand 1 "" "g"))
+   (use (match_operand:SI 2 "immediate_operand" "O"))
+   (clobber (reg:SI LR_REGNO))]
+  "TARGET_64BIT
+   && DEFAULT_ABI == ABI_AIX
+   && (INTVAL (operands[2]) & CALL_LONG) == 0"
+  "bl %z0\;%."
+  [(set_attr "type" "branch")
+   (set_attr "length" "8")])
+
 (define_insn "*call_value_nonlocal_aix32"
   [(set (match_operand 0 "" "")
 	(call (mem:SI (match_operand:SI 1 "symbol_ref_operand" "s"))
@@ -12603,45 +12582,6 @@ (define_insn "*call_value_nonlocal_aix32
   [(set_attr "type" "branch")
    (set_attr "length" "8")])
 
-(define_insn_and_split "*call_value_indirect_nonlocal_aix64_internal"
-  [(set (match_operand 0 "" "")
-	(call (mem:SI (match_operand:DI 1 "register_operand" "c,*l"))
-		      (match_operand 2 "" "g,g")))
-	(use (mem:DI (plus:DI (match_operand:DI 3 "register_operand" "b,b")
-			      (const_int 8))))
-	(use (reg:DI 11))
-	(use (mem:DI (plus:DI (reg:DI 1) (const_int 40))))
-	(clobber (reg:SI LR_REGNO))]
-  "TARGET_64BIT && DEFAULT_ABI == ABI_AIX"
-  "#"
-  "&& reload_completed"
-  [(set (reg:DI 2)
-	(mem:DI (plus:DI (match_dup 3) (const_int 8))))
-   (parallel [(set (match_dup 0) (call (mem:SI (match_dup 1))
-				       (match_dup 2)))
-	      (use (reg:DI 2))
-	      (use (reg:DI 11))
-	      (set (reg:DI 2)
-		   (mem:DI (plus:DI (reg:DI 1) (const_int 40))))
-	      (clobber (reg:SI LR_REGNO))])]
-  ""
-  [(set_attr "type" "jmpreg")
-   (set_attr "length" "12")])
-
-(define_insn "*call_value_indirect_nonlocal_aix64"
-  [(set (match_operand 0 "" "")
-	(call (mem:SI (match_operand:DI 1 "register_operand" "c,*l"))
-	      (match_operand 2 "" "g,g")))
-   (use (reg:DI 2))
-   (use (reg:DI 11))
-   (set (reg:DI 2)
-	(mem:DI (plus:DI (reg:DI 1) (const_int 40))))
-   (clobber (reg:SI LR_REGNO))]
-  "TARGET_64BIT && DEFAULT_ABI == ABI_AIX && reload_completed"
-  "b%T1l\;ld 2,40(1)"
-  [(set_attr "type" "jmpreg")
-   (set_attr "length" "8")])
-
 (define_insn "*call_value_nonlocal_aix64"
   [(set (match_operand 0 "" "")
 	(call (mem:SI (match_operand:DI 1 "symbol_ref_operand" "s"))
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 175921)
+++ gcc/doc/invoke.texi	(working copy)
@@ -807,7 +807,7 @@ See RS/6000 and PowerPC Options.
 -msdata=@var{opt}  -mvxworks  -G @var{num}  -pthread @gol
 -mrecip -mrecip=@var{opt} -mno-recip -mrecip-precision @gol
 -mno-recip-precision @gol
--mveclibabi=@var{type} -mfriz -mno-friz}
+-mveclibabi=@var{type} -mfriz -mno-friz -mr11 -mno-r11}
 
 @emph{RX Options}
 @gccoptlist{-m64bit-doubles  -m32bit-doubles  -fpu  -nofpu@gol
@@ -16325,6 +16325,19 @@ Generate (do not generate) the @code{fri
 rounding a floating point value to 64-bit integer and back to floating
 point.  The @code{friz} instruction does not return the same value if
 the floating point number is too large to fit in an integer.
+
+@item -mr11
+@itemx -mno-r11
+@opindex mr11
+Generate (do not generate) code to load up the static chain register
+(@var{r11}) when calling through a pointer on AIX and 64-bit Linux
+systems where a function pointer points to a 3 word descriptor giving
+the function address, TOC value to be loaded in register @var{r2}, and
+static chain value to be loaded in register @var{r11}.  The
+@option{-mr11} is on by default.  You will not be able to call through
+pointers to nested functions or pointers to functions compiled in
+other languages that use the static chain if you use the
+@option{-mno-r11}.
 @end table
 
 @node RX Options
Index: gcc/testsuite/gcc.target/powerpc/no-r11-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/no-r11-1.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/no-r11-1.c	(revision 0)
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { *-*-darwin* } { "*" } { "" } } */
+/* { dg-options "-O2 -mno-r11" } */
+
+int
+call_ptr (int (func) (void))
+{
+  return func () + 1;
+}
+
+/* { dg-final { scan-assembler-not "ld 11,16(3)" } } */
Index: gcc/testsuite/gcc.target/powerpc/no-r11-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/no-r11-2.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/no-r11-2.c	(revision 0)
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { *-*-darwin* } { "*" } { "" } } */
+/* { dg-options "-O2 -mr11" } */
+
+int
+call_ptr (int (func) (void))
+{
+  return func () + 1;
+}
+
+/* { dg-final { scan-assembler "ld 11,16" } } */
Index: gcc/testsuite/gcc.target/powerpc/no-r11-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/no-r11-3.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/no-r11-3.c	(revision 0)
@@ -0,0 +1,20 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { *-*-darwin* } { "*" } { "" } } */
+/* { dg-options "-O2 -mno-r11" } */
+
+extern void ext_call (int (func) (void));
+
+int
+outer_func (int init)	/* { dg-error "-mno-r11 must not be used if you have trampolines" "" } */
+{
+  int value = init;
+
+  int inner (void)
+  {
+    return ++value;
+  }
+
+  ext_call (inner);
+  return value;
+}
+

