This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
| Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
|---|---|---|
| Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
| Other format: | [Raw text] | |
On no-accumulate-outgoing-args machines, older versions of GCC would
often leak stack space in cases that had nested calls (e.g. f(g(),h())
within loops, if say g() throws an exception.
Newer versions of GCC are not so vulnerable to the problem because
gimplification mostly hides it. But it is still present in the only
remaining case I could find in which the old nested-calls is still
used: when memcpy is needed to copy arguments into place.
For example, compiling the attached program with current trunk or
earlier branches on x86 or x86_64 with -O0 -g -fexceptions -m32
-mtune=i486 -dA -S produces the following DW_CFA_GNU_args_size
entries:
.long .LCFI8-.LCFI6
.byte 0x2e # DW_CFA_GNU_args_size
.uleb128 0x10
.byte 0x4 # DW_CFA_advance_loc4
.long .LCFI9-.LCFI8
.byte 0x2e # DW_CFA_GNU_args_size
.uleb128 0x80
corresponding to assembly code like this:
addl $-128, %esp
movl %esp, %eax
movl %eax, %ecx
leal -132(%ebp), %edx
movl $128, %eax
subl $4, %esp
pushl %eax
pushl %edx
pushl %ecx
.LCFI8:
call memcpy
addl $16, %esp
.LCFI9:
call f
subl $-128, %esp
Witness how the args_size for memcpy is bogus in that it fails to take
into account the stack size already used by arguments for f, that are
already reserved on the stack when memcpy is called.
The problem is that we assume we can extract the argument size from
the call instruction, but that's not true in general, since it does
not carry information about other calls whose arguments already take
up stack space.
In one of the testcases for older versions of GCC that I was given to
work on, the problem was further complicated by the fact that the
nested function call popped stack space itself, something encoded in
the RTL by a call and a SP adjustment in parallel, that we also failed
to handle in the args-size tracking code.
This patch fixes both issues, even though I don't have a testcase to
trigger the latter problem in 4.0+, but it would hit on any
no-accumulate-outgoing-args machine whose memcpy popped arguments upon
return. Can anyone name such a machine? :-)
The testcase that currently hits is not included in the patch because
I can't figure out a way to test for the correctness of the
DW_CFA_GNU_args_size values. Although I know the constants to test
for, I'm having trouble correlating them with label names without
imposing unwarranted assumptions on how GCC is going to generate code
and arrange labels. Any ideas?
Richard, this is an update for mainline of the patch you approved in
private, but I'm going to wait for another round of review because the
trunk has this after_p argument that was not present before, that I
don't need any more AFAICT, but I can't understand it enough to decide
whether to remove it or to leave it alone in case it becomes useful
again.
This was bootstrapped and regression-tested on amd64-linux-gnu and
i486-linux-gnu about a week ago, reverting the changes introduced in
PR23602 that papered over the problem by forcing
-maccumulate-outgoing-args on in some (but not all) cases that
triggered the problem. I also enclose the reversal patch, for
reference.
Ok to install the non-reversal patch? Do you think a better
formulation of any of the bits in the reversal patch should go in too?
Attachment:
t.c
Description: Binary data
for gcc/ChangeLog
2006-03-01 Alexandre Oliva <aoliva@redhat.com>
* dwarf2out.c (dwarf2out_stack_adjust): Always track the stack
pointer, instead of assuming it is possible to derive the
correct args size from a call insn.
Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c.orig 2006-02-21 16:20:49.000000000 -0300
+++ gcc/dwarf2out.c 2006-03-03 17:44:39.000000000 -0300
@@ -1059,7 +1059,7 @@ stack_adjust_offset (rtx pattern)
much extra space it needs to pop off the stack. */
static void
-dwarf2out_stack_adjust (rtx insn, bool after_p)
+dwarf2out_stack_adjust (rtx insn, bool after_p ATTRIBUTE_UNUSED)
{
HOST_WIDE_INT offset;
const char *label;
@@ -1072,31 +1072,7 @@ dwarf2out_stack_adjust (rtx insn, bool a
if (prologue_epilogue_contains (insn) || sibcall_epilogue_contains (insn))
return;
- /* If only calls can throw, and we have a frame pointer,
- save up adjustments until we see the CALL_INSN. */
- if (!flag_asynchronous_unwind_tables && cfa.reg != STACK_POINTER_REGNUM)
- {
- if (CALL_P (insn) && !after_p)
- {
- /* Extract the size of the args from the CALL rtx itself. */
- insn = PATTERN (insn);
- if (GET_CODE (insn) == PARALLEL)
- insn = XVECEXP (insn, 0, 0);
- if (GET_CODE (insn) == SET)
- insn = SET_SRC (insn);
- gcc_assert (GET_CODE (insn) == CALL);
- dwarf2out_args_size ("", INTVAL (XEXP (insn, 1)));
- }
- return;
- }
-
- if (CALL_P (insn) && !after_p)
- {
- if (!flag_asynchronous_unwind_tables)
- dwarf2out_args_size ("", args_size);
- return;
- }
- else if (BARRIER_P (insn))
+ if (BARRIER_P (insn))
{
/* When we see a BARRIER, we know to reset args_size to 0. Usually
the compiler will have already emitted a stack adjustment, but
@@ -1118,9 +1094,20 @@ dwarf2out_stack_adjust (rtx insn, bool a
if (GET_CODE (XVECEXP (PATTERN (insn), 0, i)) == SET)
offset += stack_adjust_offset (XVECEXP (PATTERN (insn), 0, i));
}
+ else if (GET_CODE (insn) == CALL_INSN)
+ offset = 0;
else
return;
+ /* We handle this separately because we want stack adjustments in a
+ CALL_INSN to be handled. */;
+ if (GET_CODE (insn) == CALL_INSN)
+ {
+ /* If only calls can throw, adjust args_size only at call sites. */
+ if (!flag_asynchronous_unwind_tables)
+ dwarf2out_args_size ("", args_size);
+ }
+
if (offset == 0)
return;
@@ -1135,6 +1122,16 @@ dwarf2out_stack_adjust (rtx insn, bool a
if (args_size < 0)
args_size = 0;
+ /* If only calls can throw and we have a frame pointer, we'll save
+ up adjustments until we see the CALL_INSN. We used to return
+ early and derive args_size from NARGS in the CALL_INSN itself,
+ but that doesn't compute the right value if we have nested call
+ expansions, e.g., stack adjustments for a call have already been
+ emitted, and then we issue another call to compute an argument
+ for the enclosing call (i.e., bar (foo ())). */
+ if (!flag_asynchronous_unwind_tables && cfa.reg != STACK_POINTER_REGNUM)
+ return;
+
label = dwarf2out_cfi_label ();
def_cfa_1 (label, &cfa);
if (flag_asynchronous_unwind_tables)
Index: trunk/gcc/config/i386/i386.c
===================================================================
--- trunk.orig/gcc/config/i386/i386.c 2006-03-03 16:21:21.000000000 -0300
+++ trunk/gcc/config/i386/i386.c 2006-03-03 16:21:21.000000000 -0300
@@ -1956,7 +1956,7 @@ override_options (void)
pointer is present or M_A_O_A is set. Fixing this requires rewriting
unwind info generation to be aware of the CFG and propagating states
around edges. */
- if ((flag_unwind_tables || flag_asynchronous_unwind_tables
+ if (0 && (flag_unwind_tables || flag_asynchronous_unwind_tables
|| flag_exceptions || flag_non_call_exceptions)
&& flag_omit_frame_pointer
&& !(target_flags & MASK_ACCUMULATE_OUTGOING_ARGS))
Index: trunk/gcc/toplev.c
===================================================================
--- trunk.orig/gcc/toplev.c 2006-03-02 03:18:48.000000000 -0300
+++ trunk/gcc/toplev.c 2006-03-03 16:21:21.000000000 -0300
@@ -1808,7 +1808,7 @@ process_options (void)
pointer is present or A_O_A is set. Fixing this requires rewriting
unwind info generation to be aware of the CFG and propagating states
around edges. */
- if (flag_unwind_tables && !ACCUMULATE_OUTGOING_ARGS
+ if (0 && flag_unwind_tables && !ACCUMULATE_OUTGOING_ARGS
&& flag_omit_frame_pointer)
{
warning (0, "unwind tables currently requires a frame pointer "
Index: trunk/libjava/configure.host
===================================================================
--- trunk.orig/libjava/configure.host 2006-03-02 03:18:48.000000000 -0300
+++ trunk/libjava/configure.host 2006-03-04 09:55:39.000000000 -0300
@@ -92,27 +92,18 @@ case "${host}" in
;;
i686-*|i586-*|i486-*|i386-*)
sysdeps_dir=i386
- # With -fomit-frame-pointer -maccumulate-outgoing-args (implied),
- # the .text section of libgcj.so is 30k larger, and the .eh_frame
- # section is 1.4M smaller.
- libgcj_flags="${libgcj_flags} -ffloat-store -fomit-frame-pointer"
+ libgcj_flags="${libgcj_flags} -ffloat-store"
libgcj_interpreter=yes
libgcj_cxxflags=
libgcj_cflags=
DIVIDESPEC=-fno-use-divide-subroutine
+ BACKTRACESPEC=-fomit-frame-pointer
enable_hash_synchronization_default=yes
slow_pthread_self=yes
;;
x86_64-*)
sysdeps_dir=x86-64
- # For 64-bit we always use SSE registers for arithmetic,
- # which doesn't have the extra precision problems of the fpu.
- # But be careful about 32-bit multilibs.
- case " $CC " in
- *" -m32 "*)
- libgcj_flags="${libgcj_flags} -ffloat-store" ;;
- esac
- libgcj_flags="${libgcj_flags} -fomit-frame-pointer"
+ libgcj_flags="${libgcj_flags} -ffloat-store"
libgcj_cxxflags=
libgcj_cflags=
DIVIDESPEC=-f%{m32:no-}use-divide-subroutine
--
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
Secretary for FSF Latin America http://www.fsfla.org/
Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org}
| Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
|---|---|---|
| Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |