This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Fixes for powerpc-linux param passing


This patch cures a number of powerpc-linux parameter passing problems.

1) rs6000_va_arg disagreed with function_arg on where floating point
   types live, apart from SFmode, DFmode and TFmode.  function_arg
   passes other floating point types, including all complex floats, in
   gprs, whereas rs6000_va_arg expected them in fp regs.

2) rs6000_va_arg disagreed with function_arg on arg alignment.
   function_arg/function_arg_advance aligned to an even word for DFmode
   and when exactly 2 gprs were needed for args.  rs6000_va_arg aligned
   when more than one word was needed for args, a mismatch on something
   like complex long long.

3) function_arg_boundary disagreed with function_arg_advance on
   alignment for two word args that didn't happen to be either DImode,
   DFmode or a SPE vector.  This doesn't cause a problem until you
   exhaust gprs for arg passing, at which point the generic function
   calling code lays out the stack according to function_arg_boundary,
   but the rs6000 backend assumes larger alignment.  Only causes a
   problem with variable argument functions with a large number of
   fixed arguments.  Presumably such functions are rare.

4) No accounting for stack space used by AltiVec args when -mabi=altivec
   and we run out of AltiVec registers.  Again, only causes a problem
   with variable argument functions with a large number of fixed vector
   arguments.

5) Passing vectors when -maltivec and -mabi=no-altivec was broken for
   reasons detailed in http://gcc.gnu.org/ml/gcc/2004-04/msg01316.html
   Vectors need to be aligned.  Also, -maltivec shouldn't change the
   function passing mechanism, which means both
   "-mabi=no-altivec -mno-altivec" and "-mabi=no-altivec -maltivec" need
   to change.  In this patch, I've chosen to pass altivec vectors by
   reference.  If passing by value, alignment constraints allow only one
   vector to be passed in regs, in r5,r6,r7,r8.

My fix for (5) does mean an ABI change, unfortunately.  I'm not sure how
to go about implementing altivec loads and stores to unaligned
locations.  The hardware can do it using multiple instructions and a
few temp vector regs.  I did try adding the following to rs6000_emit_move
to see whether the idea is feasible:

@@ -3573,7 +3573,25 @@ rs6000_emit_move (rtx dest, rtx source, 
       return;
     }
 
+  if (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (mode))
+    {
+      /* Loads and stores using AltiVec instructions must be 16 byte
+	 aligned, at least when transferring full vectors.  */
+      if (GET_CODE (operands[0]) == MEM
+	  && MEM_ALIGN (operands[0]) < 128)
+	{
+	  debug_rtx (operands[0]);
+	  abort ();
+	}
+      if (GET_CODE (operands[1]) == MEM
+	  && MEM_ALIGN (operands[1]) < 128)
+	{
+	  debug_rtx (operands[1]);
+	  abort ();
+	}
+    }
+
   if (!no_new_pseudos)
     {
       if (GET_CODE (operands[1]) == MEM && optimize > 0

This resulted in
(mem:V4SI (reg:SI 126) [0 S16 A8])
.../gcc.dg/altivec-varargs-1.c: In function `foo':
.../gcc.dg/altivec-varargs-1.c:25: internal compiler error: in rs6000_emit_move
We're looking at the result of a MEM returned from expand_builtin_va_arg,
so that's the first thing that would need fixing..


Anyhow, the following was tested powerpc-linux and powerpc64-linux, on
both altivec and non-altivec hardware, no regressions.  Fixes the
following failures:

FAIL: gcc.c-torture/execute/va-arg-25.c execution,  -O0 
FAIL: gcc.c-torture/execute/va-arg-25.c execution,  -O1 
FAIL: gcc.c-torture/execute/va-arg-25.c execution,  -O2 
FAIL: gcc.c-torture/execute/va-arg-25.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/va-arg-25.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/va-arg-25.c execution,  -Os 
FAIL: gcc.dg/compat/scalar-by-value-3 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: gcc.dg/compat/scalar-return-3 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: gcc.dg/complex-1.c execution test
FAIL: g++.dg/ext/altivec-3.C execution test

	* config/rs6000/rs6000.c (function_arg_boundary): Align for ABI_V4
	when size is 8 bytes.
	(function_arg_advance): Account for stack space used by AltiVec
	args when -mabi=altivec.  Simplify alignment calculations.  For 
	ABI_V4, pass AltiVec vectors by reference when -mabi=no-altivec.
	(function_arg): Similarly.
	(function_arg_pass_by_reference): True for ABI_V4 AltiVec when
	not AltiVec ABI.
	(rs6000_va_arg): Correct fp arg test.  Adjust for AltiVec change.
	Correct alignment, and align before testing reg count.  Remove
	TREE_THIS_VOLATILE from reg.  Don't emit unused labels.
	(rs6000_complex_function_value): Check TARGET_HARD_FLOAT and
	TARGET_FPRS here..
	(rs6000_function_value): .. not here before call.

Some notes on the patch:
- Aldy added TREE_THIS_VOLATILE in
  http://gcc.gnu.org/ml/gcc-patches/2002-03/msg00409.html without any
  explanation.  I think it is no longer needed.  Not surprisingly, it
  pessimises the code.
- Aligning reg before comparing in rs6000_va_arg isn't a bug fix.  I
  just think it's neater that way as you then don't need the code that
  later sets reg to 8.

OK to install mainline?  Failing that, OK to install without the changes
to pass AltiVec by refernce?


--- gcc-virgin/gcc/config/rs6000/rs6000.c	2004-05-06 15:10:07.000000000 +0930
+++ gcc-mainline/gcc/config/rs6000/rs6000.c	2004-05-08 22:20:52.000000000 +0930
@@ -4065,10 +4083,10 @@ function_arg_padding (enum machine_mode 
 int
 function_arg_boundary (enum machine_mode mode, tree type ATTRIBUTE_UNUSED)
 {
-  if (DEFAULT_ABI == ABI_V4 && (mode == DImode || mode == DFmode))
+  if (DEFAULT_ABI == ABI_V4 && GET_MODE_SIZE (mode) == 8)
+    return 64;
+  else if (SPE_VECTOR_MODE (mode))
     return 64;
-   else if (SPE_VECTOR_MODE (mode))
-     return 64;
   else if (TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (mode))
     return 128;
   else
@@ -4105,6 +4123,8 @@ function_arg_advance (CUMULATIVE_ARGS *c
 
   if (TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (mode))
     {
+      bool stack = false;
+
       if (USE_ALTIVEC_FOR_ARG_P (cum, mode, type, named))
         {
 	  cum->vregno++;
@@ -4112,12 +4132,18 @@ function_arg_advance (CUMULATIVE_ARGS *c
 	    error ("Cannot pass argument in vector register because"
 		   " altivec instructions are disabled, use -maltivec"
 		   " to enable them.");
+
+	  /* PowerPC64 Linux and AIX allocate GPRs for a vector argument
+	     even if it is going to be passed in a vector register.  
+	     Darwin does the same for variable-argument functions.  */
+	  if ((DEFAULT_ABI == ABI_AIX && TARGET_64BIT)
+	      || (cum->stdarg && DEFAULT_ABI != ABI_V4))
+	    stack = true;
 	}
-      /* PowerPC64 Linux and AIX allocates GPRs for a vector argument
-	 even if it is going to be passed in a vector register.  
-	 Darwin does the same for variable-argument functions.  */
-      if ((DEFAULT_ABI == ABI_AIX && TARGET_64BIT)
-		   || (cum->stdarg && DEFAULT_ABI != ABI_V4))
+      else
+	stack = true;
+
+      if (stack)
         {
 	  int align;
 	  
@@ -4129,7 +4155,7 @@ function_arg_advance (CUMULATIVE_ARGS *c
 	     aligned.  Space for GPRs is reserved even if the argument
 	     will be passed in memory.  */
 	  if (TARGET_32BIT)
-	    align = ((6 - (cum->words & 3)) & 3);
+	    align = (2 - cum->words) & 3;
 	  else
 	    align = cum->words & 1;
 	  cum->words += align + rs6000_arg_size (mode, type);
@@ -4167,22 +4193,27 @@ function_arg_advance (CUMULATIVE_ARGS *c
 	  int n_words;
 	  int gregno = cum->sysv_gregno;
 
-	  /* Aggregates and IEEE quad get passed by reference.  */
+	  /* Aggregates, IEEE quad, and AltiVec vectors get passed by
+	     reference.  */
 	  if ((type && AGGREGATE_TYPE_P (type))
-	      || mode == TFmode)
+	      || mode == TFmode
+	      || ALTIVEC_VECTOR_MODE (mode))
 	    n_words = 1;
 	  else 
 	    n_words = rs6000_arg_size (mode, type);
 
-	  /* Long long and SPE vectors are put in odd registers.  */
-	  if (n_words == 2 && (gregno & 1) == 0)
-	    gregno += 1;
+	  /* Long long and SPE vectors are put in (r3,r4), (r5,r6),
+	     (r7,r8) or (r9,r10).  As does any other 2 word item such
+	     as complex int due to a historical mistake.  */
+	  if (n_words == 2)
+	    gregno += (1 - gregno) & 1;
 
-	  /* Long long and SPE vectors are not split between registers
-	     and stack.  */
+	  /* Multi-reg args are not split between registers and stack.  */
 	  if (gregno + n_words - 1 > GP_ARG_MAX_REG)
 	    {
-	      /* Long long is aligned on the stack.  */
+	      /* Long long and SPE vectors are aligned on the stack.
+		 So are other 2 word items such as complex int due to
+		 a historical mistake.  */
 	      if (n_words == 2)
 		cum->words += cum->words & 1;
 	      cum->words += n_words;
@@ -4465,7 +4496,7 @@ function_arg (CUMULATIVE_ARGS *cum, enum
 	     they just have to start on an even word, since the parameter
 	     save area is 16-byte aligned.  */
 	  if (TARGET_32BIT)
-	    align = ((6 - (cum->words & 3)) & 3);
+	    align = (2 - cum->words) & 3;
 	  else
 	    align = cum->words & 1;
 	  align_words = cum->words + align;
@@ -4502,18 +4533,22 @@ function_arg (CUMULATIVE_ARGS *cum, enum
 	  int n_words;
 	  int gregno = cum->sysv_gregno;
 
-	  /* Aggregates and IEEE quad get passed by reference.  */
+	  /* Aggregates, IEEE quad, and AltiVec vectors get passed by
+	     reference.  */
 	  if ((type && AGGREGATE_TYPE_P (type))
-	      || mode == TFmode)
+	      || mode == TFmode
+	      || ALTIVEC_VECTOR_MODE (mode))
 	    n_words = 1;
 	  else 
 	    n_words = rs6000_arg_size (mode, type);
 
-	  /* Long long and SPE vectors are put in odd registers.  */
-	  if (n_words == 2 && (gregno & 1) == 0)
-	    gregno += 1;
+	  /* Long long and SPE vectors are put in (r3,r4), (r5,r6),
+	     (r7,r8) or (r9,r10).  As does any other 2 word item such
+	     as complex int due to a historical mistake.  */
+	  if (n_words == 2)
+	    gregno += (1 - gregno) & 1;
 
-	  /* Long long does not split between registers and stack.  */
+	  /* Multi-reg args are not split between registers and stack.  */
 	  if (gregno + n_words - 1 <= GP_ARG_MAX_REG)
 	    return gen_rtx_REG (mode, gregno);
 	  else
@@ -4659,7 +4694,8 @@ function_arg_pass_by_reference (CUMULATI
 {
   if (DEFAULT_ABI == ABI_V4
       && ((type && AGGREGATE_TYPE_P (type))
-	  || mode == TFmode))
+	  || mode == TFmode
+	  || (!TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (mode))))
     {
       if (TARGET_DEBUG_ARG)
 	fprintf (stderr, "function_arg_pass_by_reference: aggregate\n");
@@ -4913,6 +4948,7 @@ rs6000_va_arg (tree valist, tree type)
   tree gpr, fpr, ovf, sav, reg, t, u;
   int indirect_p, size, rsize, n_reg, sav_ofs, sav_scale;
   rtx lab_false, lab_over, addr_rtx, r;
+  int align;
 
   if (DEFAULT_ABI != ABI_V4)
     {
@@ -4986,10 +5022,14 @@ rs6000_va_arg (tree valist, tree type)
 
   size = int_size_in_bytes (type);
   rsize = (size + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
+  align = 1;
 
-  if (AGGREGATE_TYPE_P (type) || TYPE_MODE (type) == TFmode)
+  if (AGGREGATE_TYPE_P (type)
+      || TYPE_MODE (type) == TFmode
+      || (!TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (TYPE_MODE (type))))
     {
-      /* Aggregates and long doubles are passed by reference.  */
+      /* Aggregates, long doubles, and AltiVec vectors are passed by
+	 reference.  */
       indirect_p = 1;
       reg = gpr;
       n_reg = 1;
@@ -4998,7 +5038,8 @@ rs6000_va_arg (tree valist, tree type)
       size = UNITS_PER_WORD;
       rsize = 1;
     }
-  else if (FLOAT_TYPE_P (type) && TARGET_HARD_FLOAT && TARGET_FPRS)
+  else if (TARGET_HARD_FLOAT && TARGET_FPRS
+	   && (TYPE_MODE (type) == SFmode || TYPE_MODE (type) == DFmode))
     {
       /* FP args go in FP registers, if present.  */
       indirect_p = 0;
@@ -5006,6 +5047,8 @@ rs6000_va_arg (tree valist, tree type)
       n_reg = 1;
       sav_ofs = 8*4;
       sav_scale = 8;
+      if (TYPE_MODE (type) == DFmode)
+	align = 8;
     }
   else
     {
@@ -5015,38 +5058,43 @@ rs6000_va_arg (tree valist, tree type)
       n_reg = rsize;
       sav_ofs = 0;
       sav_scale = 4;
+      if (n_reg == 2)
+	align = 8;
     }
 
   /* Pull the value out of the saved registers....  */
 
-  lab_false = gen_label_rtx ();
-  lab_over = gen_label_rtx ();
+  lab_over = NULL_RTX;
   addr_rtx = gen_reg_rtx (Pmode);
 
-  /*  AltiVec vectors never go in registers.  */
-  if (!TARGET_ALTIVEC || TREE_CODE (type) != VECTOR_TYPE)
+  /*  AltiVec vectors never go in registers when -mabi=altivec.  */
+  if (TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (TYPE_MODE (type)))
+    align = 16;
+  else
     {
-      TREE_THIS_VOLATILE (reg) = 1;
-      emit_cmp_and_jump_insns
-	(expand_expr (reg, NULL_RTX, QImode, EXPAND_NORMAL),
-	 GEN_INT (8 - n_reg + 1), GE, const1_rtx, QImode, 1,
-	 lab_false);
-
-      /* Long long is aligned in the registers.  */
-      if (n_reg > 1)
+      lab_false = gen_label_rtx ();
+      lab_over = gen_label_rtx ();
+
+      /* Long long and SPE vectors are aligned in the registers.
+	 As are any other 2 gpr item such as complex int due to a
+	 historical mistake.  */
+      u = reg;
+      if (n_reg == 2)
 	{
 	  u = build (BIT_AND_EXPR, TREE_TYPE (reg), reg,
 		     build_int_2 (n_reg - 1, 0));
-	  u = build (PLUS_EXPR, TREE_TYPE (reg), reg, u);
-	  u = build (MODIFY_EXPR, TREE_TYPE (reg), reg, u);
+	  u = build (POSTINCREMENT_EXPR, TREE_TYPE (reg), reg, u);
 	  TREE_SIDE_EFFECTS (u) = 1;
-	  expand_expr (u, const0_rtx, VOIDmode, EXPAND_NORMAL);
 	}
 
+      emit_cmp_and_jump_insns
+	(expand_expr (u, NULL_RTX, QImode, EXPAND_NORMAL),
+	 GEN_INT (8 - n_reg + 1), GE, const1_rtx, QImode, 1,
+	 lab_false);
+
+      t = sav;
       if (sav_ofs)
 	t = build (PLUS_EXPR, ptr_type_node, sav, build_int_2 (sav_ofs, 0));
-      else
-	t = sav;
 
       u = build (POSTINCREMENT_EXPR, TREE_TYPE (reg), reg,
 		 build_int_2 (n_reg, 0));
@@ -5067,40 +5115,18 @@ rs6000_va_arg (tree valist, tree type)
 
       emit_jump_insn (gen_jump (lab_over));
       emit_barrier ();
-    }
 
-  emit_label (lab_false);
+      emit_label (lab_false);
+    }
 
   /* ... otherwise out of the overflow area.  */
 
-  /* Make sure we don't find reg 7 for the next int arg.
-
-     All AltiVec vectors go in the overflow area.  So in the AltiVec
-     case we need to get the vectors from the overflow area, but
-     remember where the GPRs and FPRs are.  */
-  if (n_reg > 1 && (TREE_CODE (type) != VECTOR_TYPE
-		    || !TARGET_ALTIVEC))
-    {
-      t = build (MODIFY_EXPR, TREE_TYPE (reg), reg, build_int_2 (8, 0));
-      TREE_SIDE_EFFECTS (t) = 1;
-      expand_expr (t, const0_rtx, VOIDmode, EXPAND_NORMAL);
-    }
-
   /* Care for on-stack alignment if needed.  */
-  if (rsize <= 1)
-    t = ovf;
-  else
+  t = ovf;
+  if (align != 1)
     {
-      int align;
-
-      /* AltiVec vectors are 16 byte aligned.  */
-      if (TARGET_ALTIVEC && TREE_CODE (type) == VECTOR_TYPE)
-	align = 15;
-      else
-	align = 7;
-
-      t = build (PLUS_EXPR, TREE_TYPE (ovf), ovf, build_int_2 (align, 0));
-      t = build (BIT_AND_EXPR, TREE_TYPE (t), t, build_int_2 (-align-1, -1));
+      t = build (PLUS_EXPR, TREE_TYPE (t), t, build_int_2 (align - 1, 0));
+      t = build (BIT_AND_EXPR, TREE_TYPE (t), t, build_int_2 (-align, -1));
     }
   t = save_expr (t);
 
@@ -5113,7 +5139,8 @@ rs6000_va_arg (tree valist, tree type)
   TREE_SIDE_EFFECTS (t) = 1;
   expand_expr (t, const0_rtx, VOIDmode, EXPAND_NORMAL);
 
-  emit_label (lab_over);
+  if (lab_over)
+    emit_label (lab_over);
 
   if (indirect_p)
     {
@@ -16199,7 +16226,7 @@ rs6000_complex_function_value (enum mach
   enum machine_mode inner = GET_MODE_INNER (mode);
   unsigned int inner_bytes = GET_MODE_SIZE (inner);
 
-  if (FLOAT_MODE_P (mode))
+  if (FLOAT_MODE_P (mode) && TARGET_HARD_FLOAT && TARGET_FPRS)
     regno = FP_ARG_RETURN;
   else
     {
@@ -16257,10 +16284,9 @@ rs6000_function_value (tree valtype, tre
   else
     mode = TYPE_MODE (valtype);
 
-  if (TREE_CODE (valtype) == REAL_TYPE && TARGET_HARD_FLOAT && TARGET_FPRS)
+  if (SCALAR_FLOAT_TYPE_P (valtype) && TARGET_HARD_FLOAT && TARGET_FPRS)
     regno = FP_ARG_RETURN;
   else if (TREE_CODE (valtype) == COMPLEX_TYPE
-	   && TARGET_HARD_FLOAT
 	   && targetm.calls.split_complex_arg)
     return rs6000_complex_function_value (mode);
   else if (TREE_CODE (valtype) == VECTOR_TYPE

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]