This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

fix incorrect computation of DW_CFA_GNU_args_size in nested calls


On no-accumulate-outgoing-args machines, older versions of GCC would
often leak stack space in cases that had nested calls (e.g. f(g(),h())
within loops, if say g() throws an exception.

Newer versions of GCC are not so vulnerable to the problem because
gimplification mostly hides it.  But it is still present in the only
remaining case I could find in which the old nested-calls is still
used: when memcpy is needed to copy arguments into place.

For example, compiling the attached program with current trunk or
earlier branches on x86 or x86_64 with -O0 -g -fexceptions -m32
-mtune=i486 -dA -S produces the following DW_CFA_GNU_args_size
entries:

        .long   .LCFI8-.LCFI6
        .byte   0x2e    # DW_CFA_GNU_args_size
        .uleb128 0x10
        .byte   0x4     # DW_CFA_advance_loc4
        .long   .LCFI9-.LCFI8
        .byte   0x2e    # DW_CFA_GNU_args_size
        .uleb128 0x80

corresponding to assembly code like this:

        addl    $-128, %esp
        movl    %esp, %eax
        movl    %eax, %ecx
        leal    -132(%ebp), %edx
        movl    $128, %eax
        subl    $4, %esp
        pushl   %eax
        pushl   %edx
        pushl   %ecx
.LCFI8:
        call    memcpy
        addl    $16, %esp
.LCFI9:
        call    f
        subl    $-128, %esp


Witness how the args_size for memcpy is bogus in that it fails to take
into account the stack size already used by arguments for f, that are
already reserved on the stack when memcpy is called.

The problem is that we assume we can extract the argument size from
the call instruction, but that's not true in general, since it does
not carry information about other calls whose arguments already take
up stack space.


In one of the testcases for older versions of GCC that I was given to
work on, the problem was further complicated by the fact that the
nested function call popped stack space itself, something encoded in
the RTL by a call and a SP adjustment in parallel, that we also failed
to handle in the args-size tracking code.

This patch fixes both issues, even though I don't have a testcase to
trigger the latter problem in 4.0+, but it would hit on any
no-accumulate-outgoing-args machine whose memcpy popped arguments upon
return.  Can anyone name such a machine? :-)


The testcase that currently hits is not included in the patch because
I can't figure out a way to test for the correctness of the
DW_CFA_GNU_args_size values.  Although I know the constants to test
for, I'm having trouble correlating them with label names without
imposing unwarranted assumptions on how GCC is going to generate code
and arrange labels.  Any ideas?


Richard, this is an update for mainline of the patch you approved in
private, but I'm going to wait for another round of review because the
trunk has this after_p argument that was not present before, that I
don't need any more AFAICT, but I can't understand it enough to decide
whether to remove it or to leave it alone in case it becomes useful
again.

This was bootstrapped and regression-tested on amd64-linux-gnu and
i486-linux-gnu about a week ago, reverting the changes introduced in
PR23602 that papered over the problem by forcing
-maccumulate-outgoing-args on in some (but not all) cases that
triggered the problem.  I also enclose the reversal patch, for
reference.

Ok to install the non-reversal patch?  Do you think a better
formulation of any of the bits in the reversal patch should go in too?

Attachment: t.c
Description: Binary data

for gcc/ChangeLog
2006-03-01  Alexandre Oliva  <aoliva@redhat.com>

	* dwarf2out.c (dwarf2out_stack_adjust): Always track the stack
	pointer, instead of assuming it is possible to derive the
	correct args size from a call insn.

Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c.orig	2006-02-21 16:20:49.000000000 -0300
+++ gcc/dwarf2out.c	2006-03-03 17:44:39.000000000 -0300
@@ -1059,7 +1059,7 @@ stack_adjust_offset (rtx pattern)
    much extra space it needs to pop off the stack.  */
 
 static void
-dwarf2out_stack_adjust (rtx insn, bool after_p)
+dwarf2out_stack_adjust (rtx insn, bool after_p ATTRIBUTE_UNUSED)
 {
   HOST_WIDE_INT offset;
   const char *label;
@@ -1072,31 +1072,7 @@ dwarf2out_stack_adjust (rtx insn, bool a
   if (prologue_epilogue_contains (insn) || sibcall_epilogue_contains (insn))
     return;
 
-  /* If only calls can throw, and we have a frame pointer,
-     save up adjustments until we see the CALL_INSN.  */
-  if (!flag_asynchronous_unwind_tables && cfa.reg != STACK_POINTER_REGNUM)
-    {
-      if (CALL_P (insn) && !after_p)
-	{
-	  /* Extract the size of the args from the CALL rtx itself.  */
-	  insn = PATTERN (insn);
-	  if (GET_CODE (insn) == PARALLEL)
-	    insn = XVECEXP (insn, 0, 0);
-	  if (GET_CODE (insn) == SET)
-	    insn = SET_SRC (insn);
-	  gcc_assert (GET_CODE (insn) == CALL);
-	  dwarf2out_args_size ("", INTVAL (XEXP (insn, 1)));
-	}
-      return;
-    }
-
-  if (CALL_P (insn) && !after_p)
-    {
-      if (!flag_asynchronous_unwind_tables)
-	dwarf2out_args_size ("", args_size);
-      return;
-    }
-  else if (BARRIER_P (insn))
+  if (BARRIER_P (insn))
     {
       /* When we see a BARRIER, we know to reset args_size to 0.  Usually
 	 the compiler will have already emitted a stack adjustment, but
@@ -1118,9 +1094,20 @@ dwarf2out_stack_adjust (rtx insn, bool a
 	if (GET_CODE (XVECEXP (PATTERN (insn), 0, i)) == SET)
 	  offset += stack_adjust_offset (XVECEXP (PATTERN (insn), 0, i));
     }
+  else if (GET_CODE (insn) == CALL_INSN)
+    offset = 0;
   else
     return;
 
+  /* We handle this separately because we want stack adjustments in a
+     CALL_INSN to be handled.  */;
+  if (GET_CODE (insn) == CALL_INSN)
+    {
+      /* If only calls can throw, adjust args_size only at call sites.  */
+      if (!flag_asynchronous_unwind_tables)
+	dwarf2out_args_size ("", args_size);
+    }
+
   if (offset == 0)
     return;
 
@@ -1135,6 +1122,16 @@ dwarf2out_stack_adjust (rtx insn, bool a
   if (args_size < 0)
     args_size = 0;
 
+  /* If only calls can throw and we have a frame pointer, we'll save
+     up adjustments until we see the CALL_INSN.  We used to return
+     early and derive args_size from NARGS in the CALL_INSN itself,
+     but that doesn't compute the right value if we have nested call
+     expansions, e.g., stack adjustments for a call have already been
+     emitted, and then we issue another call to compute an argument
+     for the enclosing call (i.e., bar (foo ())).  */
+  if (!flag_asynchronous_unwind_tables && cfa.reg != STACK_POINTER_REGNUM)
+    return;
+
   label = dwarf2out_cfi_label ();
   def_cfa_1 (label, &cfa);
   if (flag_asynchronous_unwind_tables)
Index: trunk/gcc/config/i386/i386.c
===================================================================
--- trunk.orig/gcc/config/i386/i386.c	2006-03-03 16:21:21.000000000 -0300
+++ trunk/gcc/config/i386/i386.c	2006-03-03 16:21:21.000000000 -0300
@@ -1956,7 +1956,7 @@ override_options (void)
      pointer is present or M_A_O_A is set.  Fixing this requires rewriting
      unwind info generation to be aware of the CFG and propagating states
      around edges.  */
-  if ((flag_unwind_tables || flag_asynchronous_unwind_tables
+  if (0 && (flag_unwind_tables || flag_asynchronous_unwind_tables
        || flag_exceptions || flag_non_call_exceptions)
       && flag_omit_frame_pointer
       && !(target_flags & MASK_ACCUMULATE_OUTGOING_ARGS))
Index: trunk/gcc/toplev.c
===================================================================
--- trunk.orig/gcc/toplev.c	2006-03-02 03:18:48.000000000 -0300
+++ trunk/gcc/toplev.c	2006-03-03 16:21:21.000000000 -0300
@@ -1808,7 +1808,7 @@ process_options (void)
      pointer is present or A_O_A is set.  Fixing this requires rewriting
      unwind info generation to be aware of the CFG and propagating states
      around edges.  */
-  if (flag_unwind_tables && !ACCUMULATE_OUTGOING_ARGS
+  if (0 && flag_unwind_tables && !ACCUMULATE_OUTGOING_ARGS
       && flag_omit_frame_pointer)
     {
       warning (0, "unwind tables currently requires a frame pointer "
Index: trunk/libjava/configure.host
===================================================================
--- trunk.orig/libjava/configure.host	2006-03-02 03:18:48.000000000 -0300
+++ trunk/libjava/configure.host	2006-03-04 09:55:39.000000000 -0300
@@ -92,27 +92,18 @@ case "${host}" in
 	;;
   i686-*|i586-*|i486-*|i386-*)
 	sysdeps_dir=i386
-	# With -fomit-frame-pointer -maccumulate-outgoing-args (implied),
-	# the .text section of libgcj.so is 30k larger, and the .eh_frame
-	# section is 1.4M smaller.
-	libgcj_flags="${libgcj_flags} -ffloat-store -fomit-frame-pointer"
+	libgcj_flags="${libgcj_flags} -ffloat-store"
 	libgcj_interpreter=yes
 	libgcj_cxxflags=
 	libgcj_cflags=
 	DIVIDESPEC=-fno-use-divide-subroutine
+	BACKTRACESPEC=-fomit-frame-pointer
 	enable_hash_synchronization_default=yes
 	slow_pthread_self=yes
 	;;
   x86_64-*)
 	sysdeps_dir=x86-64
-	# For 64-bit we always use SSE registers for arithmetic,
-	# which doesn't have the extra precision problems of the fpu.
-	# But be careful about 32-bit multilibs.
-	case " $CC " in
-	*" -m32 "*)
-		libgcj_flags="${libgcj_flags} -ffloat-store" ;;
-	esac
-	libgcj_flags="${libgcj_flags} -fomit-frame-pointer"
+	libgcj_flags="${libgcj_flags} -ffloat-store"
 	libgcj_cxxflags=
 	libgcj_cflags=
 	DIVIDESPEC=-f%{m32:no-}use-divide-subroutine
-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
Secretary for FSF Latin America        http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]