gcc.dg/torture/stackalign/non-local-goto-4.c and several similar test cases fail with current 4.6 on m68k-linux. Bisection identified r160124 as the cause: Author: hubicka Date: Tue Jun 1 21:55:49 2010 New Revision: 160124 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=160124 Log: * ipa-pure-const.c (local_pure_const): Do NORETURN discovery. The essence of non-local-goto-4.c is a function x() with a local self-recursive function y() and a non-local goto from y() to x(): x(a) { __label__ xlab; void y(a) { if (a==0) goto xlab; y (a-1); } y (a); xlab:; return a; } With r160123, gcc -O3 -fomit-frame-pointer -S generates this for x() and y(): .text .align 2 .type y.1201, @function y.1201: link.w %fp,#0 move.l #.L2,%a1 move.l (%a0),%fp move.l 4(%a0),%sp jmp (%a1) .size y.1201, .-y.1201 .align 2 .globl x .type x, @function x: lea (-12,%sp),%sp fmovem #252,-(%sp) movem.l #16190,-(%sp) lea (128,%sp),%a0 move.l %a0,116(%sp) move.l %sp,120(%sp) move.l 132(%sp),-(%sp) lea (120,%sp),%a0 jsr y.1201 addq.l #4,%sp .L4: move.l 132(%sp),%d0 movem.l (%sp)+,#31996 fmovem (%sp)+,#63 lea (12,%sp),%sp rts .L2: move.l 132(%sp),%d0 movem.l (%sp)+,#31996 fmovem (%sp)+,#63 lea (12,%sp),%sp rts .size x, .-x This works. But r160124 changes the code to: .align 2 .type y.1201, @function y.1201: link.w %fp,#0 move.l #.L2,%a1 move.l (%a0),%fp move.l 4(%a0),%sp jmp (%a1) .size y.1201, .-y.1201 .align 2 .globl x .type x, @function x: lea (-12,%sp),%sp fmovem #252,-(%sp) movem.l #16190,-(%sp) lea (128,%sp),%a0 move.l %a0,116(%sp) move.l %sp,120(%sp) move.l 132(%sp),-(%sp) lea (120,%sp),%a0 jsr y.1201 .L2: .L4: move.l 136(%sp),%d0 movem.l (%sp)+,#31996 fmovem (%sp)+,#63 lea (12,%sp),%sp rts .size x, .-x A diff shows that L2 and L4 have been merged, a stack increment before L4 has been removed and the stack offset for loading %d0 in L4 has been adjusted for that. However, the path leading up to L2 is the same as before, but L2 is now the same as L4 so uses an incorrect (off-by-4) stack offset when fetching %d2. --- non-local-goto-4.s-r160123 2011-02-27 23:44:28.000000000 +0100 +++ non-local-goto-4.s-r160124 2011-02-27 23:36:27.000000000 +0100 @@ -23,15 +23,9 @@ move.l 132(%sp),-(%sp) lea (120,%sp),%a0 jsr y.1201 - addq.l #4,%sp -.L4: - move.l 132(%sp),%d0 - movem.l (%sp)+,#31996 - fmovem (%sp)+,#63 - lea (12,%sp),%sp - rts .L2: - move.l 132(%sp),%d0 +.L4: + move.l 136(%sp),%d0 movem.l (%sp)+,#31996 fmovem (%sp)+,#63 lea (12,%sp),%sp
This is probably related to PR47815. Also y isn't really noreturn, is it? Honza? Shouldn't non-local gotos also prevent noreturn?
On Mon, 28 Feb 2011, rguenth at gcc dot gnu.org wrote: > Also y isn't really noreturn, is it? Honza? Shouldn't non-local gotos > also prevent noreturn? noreturn means a function does not return to its caller in the ordinary way; it says nothing about other control flow from that function (non-local gotos, longjmp, throwing exceptions).
I believe that the nonlocal gotos should be represented correctly in function marked noreturn. So i think noreturn function can do nonlocal goto... I get following CFG: x (int a) { struct FRAME.x FRAME.0; # BLOCK 2 freq:10000 # PRED: ENTRY [100.0%] (fallthru,exec) y (a_1(D)); [static-chain: &FRAME.0] # SUCC: 3 [100.0%] (ab,exec) # BLOCK 3 freq:10000 # PRED: 2 [100.0%] (ab,exec) <L0>: [non-local] # SUCC: 4 [100.0%] (fallthru,exec) # BLOCK 4 freq:10000 # PRED: 3 [100.0%] (fallthru,exec) xlab: return a_1(D); # SUCC: EXIT [100.0%] } this seems all right (i.e. there is abnormal edge to BB 3 for non-local goto) We need to work out what really goes wrong for m68. Honza
GCC 4.6.0 is being released, adjusting target milestone.
GCC 4.6.1 is being released.
I can trigger the bug on i386 too, if I tweak the i386 backend to push function parameters rather than moving them to pre-allocated stack space, and to disallow regparm promotion for calls to non-exported functions, via this patch: --- gcc-4.6-r160123/gcc/config/i386/i386.h.~1~ 2010-05-29 16:03:31.000000000 +0200 +++ gcc-4.6-r160123/gcc/config/i386/i386.h 2011-07-08 10:43:46.000000000 +0200 @@ -1513,13 +1513,21 @@ enum reg_class prologue and apilogue. This is not possible without ACCUMULATE_OUTGOING_ARGS. */ +#if 0 #define ACCUMULATE_OUTGOING_ARGS \ (TARGET_ACCUMULATE_OUTGOING_ARGS || ix86_cfun_abi () == MS_ABI) +#else +#define ACCUMULATE_OUTGOING_ARGS 0 +#endif /* If defined, a C expression whose value is nonzero when we want to use PUSH instructions to pass outgoing arguments. */ +#if 0 #define PUSH_ARGS (TARGET_PUSH_ARGS && !ACCUMULATE_OUTGOING_ARGS) +#else +#define PUSH_ARGS 1 +#endif /* We want the stack and args grow in opposite directions, even if PUSH_ARGS is 0. */ @@ -1804,7 +1812,11 @@ typedef struct ix86_args { #define X86_64_REGPARM_MAX 6 #define X86_64_MS_REGPARM_MAX 4 +#if 0 #define X86_32_REGPARM_MAX 3 +#else +#define X86_32_REGPARM_MAX 0 +#endif #define REGPARM_MAX \ (TARGET_64BIT ? (TARGET_64BIT_MS_ABI ? X86_64_MS_REGPARM_MAX \ With that in place gcc still bootstraps ok on i686-linux, but r160124 now miscompiles non-local-goto-4.c just like on m68k (see the changed and now incorrect stack offset in the first insn after L4): --- non-local-goto-4.s-r160123 2011-07-08 10:58:40.000000000 +0200 +++ non-local-goto-4.s-r160124 2011-07-08 11:24:05.000000000 +0200 @@ -25,19 +25,17 @@ movl %esp, 8(%esp) pushl 36(%esp) call y.1220 - popl %eax + .p2align 4,,7 + .p2align 3 +.L2: .L4: - movl 36(%esp), %eax + movl 40(%esp), %eax movl 16(%esp), %ebx movl 20(%esp), %esi movl 24(%esp), %edi movl 28(%esp), %ebp addl $32, %esp ret - .p2align 4,,7 - .p2align 3 -.L2: - jmp .L4 .size x, .-x .p2align 4,,15 .globl main When I look at dumps everything looks ok up to .188r.asmcons, but in .191r.ira I see a new load from the stack with the wrong offset.
I don't even have to tweak the i386 backend to trigger the error on i386, just make the nested function y() regparm(0) --- non-local-goto-4.c.~1~ 2011-07-07 20:04:29.000000000 +0200 +++ non-local-goto-4.c 2011-07-08 12:49:36.000000000 +0200 @@ -15,7 +15,7 @@ int x(a) { __label__ xlab; - void y(a) + void __attribute__((regparm(0))) y(a) { if (a==0) goto xlab; and compile with options to force the use of PUSH for parameter passing: > gcc -O3 -fomit-frame-pointer -fno-asynchronous-unwind-tables -mpush-args -mno-accumulate-outgoing-args non-local-goto-4.c > ./a.out Abort
This looks like a missing stack adjustment. do_pending_stack_adjust needs to be called before any potential control flow change. Maybe such a call is missing?
Julian Brown's proposed patch fixes non-local-goto-4.c on both m68k and i386: http://gcc.gnu.org/ml/gcc-patches/2011-10/msg01657.html
GCC 4.6.2 is being released.
Author: jules Date: Fri Oct 28 10:48:32 2011 New Revision: 180611 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=180611 Log: PR rtl-optimization/47918 * reload1.c (set_initial_label_offsets): Use initial offsets for labels on the nonlocal_goto_handler_labels chain. Modified: trunk/gcc/ChangeLog trunk/gcc/reload1.c
Author: jules Date: Tue Nov 1 18:38:42 2011 New Revision: 180740 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=180740 Log: PR rtl-optimization/47918 * reload1.c (set_initial_label_offsets): Use initial offsets for labels on the nonlocal_goto_handler_labels chain. Modified: branches/gcc-4_6-branch/gcc/ChangeLog branches/gcc-4_6-branch/gcc/reload1.c
Presumably everywhere.