This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] If ACCUMULATE_OUTGOING_ARGS, precompute CALL_EXPR arguments of another CALL_EXPR (PR middle-end/32285)


Hi!

Before the
http://gcc.gnu.org/ml/gcc-patches/2004-07/msg00190.html
patch GCC precomputed arguments that call another function
(any kind if ACCUMULATE_OUTGOING_ARGS, otherwise just alloca).
CALL_EXPRs in another CALL_EXPR's arguments are much rarer now
with tree-ssa, but they aren't really extinct (const and pure
CALL_EXPRs still appear there).
Say on:
int foo (long long a, long long b, long long c, long long d, long long e) __attribute__((pure));
void bar (long long a, int b, long long c, long long d, long long e, int f, long long g);

int
baz (void)
{
  bar (6, foo (1, 2, 3, 4, 5), 7, 8, 9, foo (12, 12, 13, 14, 16), 11);
  return 0;
}
, i386-linux -O2 -m32, gcc 3.4 compiles this into 45 instructions, but
gcc trunk into 72 instructions.  With the patch below which restores
that behavior back were are down to 44 instructions.  As the arguments
are gimple, we don't have to chase hard for the call IMNSHO, just checking
whether the argument is a CALL_EXPR should be enough.

This patch also fixes a 4.1 regression from 3.4, where not precomputing
a CALL_EXPR leads into miscompilation.  In 4.2/4.3 that's mitigated by
IMHO overly pessimistic tree-nrv.c change, see
http://gcc.gnu.org/ml/gcc-patches/2007-06/msg00942.html
(but with that patch applied the following is needed even there).

I'm slightly confused by the:
      /* If structure_value_addr is a REG other than
         virtual_outgoing_args_rtx, we can use always use it.  If it
         is not a REG, we must always copy it into a register.
         If it is virtual_outgoing_args_rtx, we must copy it to another
         register in some cases.  */
      rtx temp = (!REG_P (structure_value_addr)
                  || (ACCUMULATE_OUTGOING_ARGS
                      && stack_arg_under_construction
                      && structure_value_addr == virtual_outgoing_args_rtx)
                  ? copy_addr_to_reg (convert_memory_address
                                      (Pmode, structure_value_addr))
                  : structure_value_addr);
code in expand_call - apparently some older GCC at some point allowed
calls where in ACCUMULATE_OUTGOING_ARGS the structure value address
was virtual outgoing args, but:
1) don't see how it could result in correct code, when virtual_outgoing_args
are also used for the fn's arguments, it hardly can be used also for the
return value (e.g. the callee might compute the return struct first and then
say modify one of its parameters for some reason)
2) why the above is needed - virtual_outgoing_args_rtx when
ACCUMULATE_OUTGOING_ARGS should be constant, shouldn't it?
3) what's so special about virtual_outgoing_args_rtx and not on say
virtual_outgoing_args_rtx + 8?  I tried to reach this code on i?86 -m32,
but did not manage in 3.4.x - precompute_arguments already precomputed it
first.

Anyway, I believe the patch below improves generated code in addition
to fixing the bug, is that ok for 4.3/4.2/4.1?

2007-06-14  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/32285
	* calls.c (precompute_arguments): Also precompute CALL_EXPR arguments
	if ACCUMULATE_OUTGOING_ARGS.

	* gcc.c-torture/execute/20070614-1.c: New test.

--- gcc/calls.c.jj	2007-06-13 17:38:55.000000000 +0200
+++ gcc/calls.c	2007-06-14 14:50:56.000000000 +0200
@@ -1269,13 +1269,25 @@ precompute_arguments (int flags, int num
 
   /* If this is a libcall, then precompute all arguments so that we do not
      get extraneous instructions emitted as part of the libcall sequence.  */
-  if ((flags & ECF_LIBCALL_BLOCK) == 0)
+
+  /* If we preallocated the stack space, and some arguments must be passed
+     on the stack, then we must precompute any parameter which contains a
+     function call which will store arguments on the stack.
+     Otherwise, evaluating the parameter may clobber previous parameters
+     which have already been stored into the stack.  (we have code to avoid
+     such case by saving the outgoing stack arguments, but it results in
+     worse code)  */
+  if ((flags & ECF_LIBCALL_BLOCK) == 0 && !ACCUMULATE_OUTGOING_ARGS)
     return;
 
   for (i = 0; i < num_actuals; i++)
     {
       enum machine_mode mode;
 
+      if ((flags & ECF_LIBCALL_BLOCK) == 0
+	  && TREE_CODE (args[i].tree_value) != CALL_EXPR)
+	continue;
+
       /* If this is an addressable type, we cannot pre-evaluate it.  */
       gcc_assert (!TREE_ADDRESSABLE (TREE_TYPE (args[i].tree_value)));
 
--- gcc/testsuite/gcc.c-torture/execute/20070614-1.c.jj	2007-06-14 15:32:28.000000000 +0200
+++ gcc/testsuite/gcc.c-torture/execute/20070614-1.c	2007-06-11 13:23:19.000000000 +0200
@@ -0,0 +1,33 @@
+extern void abort (void);
+
+_Complex v = 3.0 + 1.0iF;
+
+void
+foo (_Complex z, int *x)
+{
+  if (z != v)
+    abort ();
+}
+
+_Complex bar (_Complex z) __attribute__ ((pure));
+_Complex
+bar (_Complex z)
+{
+  return v;
+}
+
+int
+baz (void)
+{
+  int a, i;
+  for (i = 0; i < 6; i++)
+    foo (bar (1.0iF * i), &a);
+  return 0;
+}
+
+int
+main ()
+{
+  baz ();
+  return 0;
+}


	Jakub


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]