Bug 84521 - aarch64: Frame-pointer corruption with __builtin_setjmp/__builtin_longjmp and -fomit-frame-pointer
Summary: aarch64: Frame-pointer corruption with __builtin_setjmp/__builtin_longjmp and...
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 8.0
: P3 normal
Target Milestone: ---
Assignee: Wilco
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2018-02-22 18:43 UTC by David Malcolm
Modified: 2019-06-20 15:55 UTC (History)
6 users (show)

See Also:
Host: aarch64-unknown-linux-gnu
Target: aarch64-unknown-linux-gnu
Build: aarch64-unknown-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2018-02-22 00:00:00


Attachments
Reproducer (526 bytes, text/plain)
2018-02-22 18:50 UTC, David Malcolm
Details
Failing test case (362 bytes, text/x-csrc)
2018-08-01 10:15 UTC, sudi
Details

Note You need to log in before you can comment on or make changes to this bug.
Description David Malcolm 2018-02-22 18:43:08 UTC
The downstream bug report here:
  https://bugzilla.redhat.com/show_bug.cgi?id=1545239
describes a problem seen on aarch64 with gcc 8 that breaks the build of Ruby at -O1 and above.

Bisection shows that the problem started with r254815, which made -fomit-frame-pointer the default.
Jakub reported:
> It is actually much older, I get the same crash if vm.c is compiled with
> -mlittle-endian -mabi=lp64 -g -grecord-gcc-switches -O1 -Wall -Werror=format-security
> -fexceptions -fPIC -fstack-protector -fno-strict-overflow -fexcess-precision=standard -fomit-frame-pointer
> with r204770, so already GCC 4.9 behaves that way too.
> Note ruby uses -fno-omit-frame-pointer already, but only on mingw32.

The issue is that the code generated for __builtin_longjmp reads a value for x29 (the frame pointer) from the jmp_buf, but the code generated for __builtin_setjmp doesn't actually write x29 to the jmp_buf, leading to corruption of x29 when a longjmp occurs.

This corruption seems to be short-lived when -fno-omit-frame-pointer (the old default), as every function restores x29 from the stack on exit.
With the new default of -fomit-frame-pointer the corruption can survive long enough to cause crashes.

There's a lot more analysis at the downstream bug report in the URL above.

I'm about to attach a reproducer.

I'm marking this as "[8 Regression]" since although this is appears to be a long-standing bug, the change of default in r254815 exposes it by default.
Comment 1 David Malcolm 2018-02-22 18:50:22 UTC
Created attachment 43489 [details]
Reproducer

When compiled with:
  gcc -DDUMP -g -O0 -fstack-protector-strong -Wall test.c
this runs to completion, and the x29 values show the function calls/returns:
x29 = 0x7ff2977910 : main : start of main
x29 = 0x7ff29778d0 : test_2 : start of test_2
x29 = 0x7ff29776a0 : test_1 : start of test_1
x29 = 0x7ff29776a0 : test_1 : zero return
x29 = 0x7ff2977690 : uses_longjmp : in uses_longjmp
x29 = 0x7ff2977740 : test_1 : non-zero return
x29 = 0x7ff2977690 : after_longjmp : after raise
x29 = 0x7ff2977740 : test_1 : end of test_1
x29 = 0x7ff29778d0 : test_2 : end of test_2
x29 = 0x7ff2977910 : main : end of main

On adding -fomit-frame-pointer, it crashes, and the x29 values show a corruption after "uses_longjmp" which becomes a crash when the x29 value is later used:

x29 = 0x7fff4709c0 : main : start of main
x29 = 0x7fff470960 : test_2 : start of test_2
x29 = 0x7fff470960 : test_1 : start of test_1
x29 = 0x7fff470960 : test_1 : zero return
x29 = 0x7fff470720 : uses_longjmp : in uses_longjmp
x29 = 0x7fff4707d0 : test_1 : non-zero return
x29 = 0x7fff4707d0 : after_longjmp : after raise
x29 = 0x7fff4707d0 : test_1 : end of test_1
x29 = 0x7fff4707d0 : test_2 : end of test_2
*** stack smashing detected ***: ./test-O0-omit-fp terminated
Comment 2 James Greenhalgh 2018-02-22 18:51:02 UTC
It is a bug that we have changed to -fomit-frame-pointer by default for AArch64. That changes a long standing ABI decision made at the dawn of the port, and promised as a feature of the architecture. I would like to see this fixed for GCC 8.

Ramana was testing a patch to fix this and change us back to -fno-omit-frame-pointer, it (or someone else's patch achieving the same) would be appreciated as the immediate fix for this issue.

I haven't validated the longer-term problem you mention with -fomit-frame-pointer.

Ramana, can you pick this up and set us back to the appropriate default? Otherwise, I can spin a patch. We should fix this urgently, or we miss the good value that comes from whole-distribution testing.
Comment 3 Andrew Pinski 2018-02-22 19:12:58 UTC
To me any use of __builtin_setjmp/__builtin_longjmp is almost always incorrect.
Comment 4 Jakub Jelinek 2018-02-22 19:24:53 UTC
Is the requirement just for functions that contain setjmp?  If so, the backend could just force frame pointers in cfun->calls_setjmp functions.

If not, even if the default is tweaked again to be -fno-omit-frame-pointer on aarch64, the code is still wrong with explicit -fno-omit-frame-pointer, even before that change.

The test uses __builtin_setjmp, can't reproduce it when using normal setjmp,
so is it just __builtin_setjmp that requires frame pointers?  I think we don't really have a flag about uses of __builtin_setjmp right now, but it could be added next to calls_setjmp (calls_builtin_setjmp).  Marking the function with
__builtin_setjmp with __attribute__((optimize ("no-omit-frame-pointer"))) fixes it too.
Comment 5 Jakub Jelinek 2018-02-22 19:31:19 UTC
--- gcc/config/aarch64/aarch64.c.jj	2018-02-22 09:26:12.028616476 +0100
+++ gcc/config/aarch64/aarch64.c	2018-02-22 20:23:29.449621557 +0100
@@ -7432,6 +7432,20 @@ aarch64_secondary_reload (bool in_p ATTR
   return NO_REGS;
 }
 
+/* Value should be nonzero if functions must have frame pointers.
+   Zero means the frame pointer need not be set up (and parms may
+   be accessed via the stack pointer) in functions that seem suitable.  */
+
+static bool
+aarch64_frame_pointer_required (void)
+{
+  /* __builtin_setjmp requries frame pointers.  */
+  if (cfun->calls_setjmp)
+    return true;
+
+  return false;
+}
+
 static bool
 aarch64_can_eliminate (const int from ATTRIBUTE_UNUSED, const int to)
 {
@@ -17463,6 +17477,9 @@ aarch64_run_selftests (void)
 #undef TARGET_CALLEE_COPIES
 #define TARGET_CALLEE_COPIES hook_bool_CUMULATIVE_ARGS_mode_tree_bool_false
 
+#undef TARGET_FRAME_POINTER_REQUIRED
+#define TARGET_FRAME_POINTER_REQUIRED aarch64_frame_pointer_required
+
 #undef TARGET_CAN_ELIMINATE
 #define TARGET_CAN_ELIMINATE aarch64_can_eliminate
 

(completely untested) would require frame pointers for all setjmp calls, not just __builtin_setjmp.

I agree pretty much all uses of __builtin_setjmp are a bug, but somebody needs to explain it to the ruby authors.  It is even weirder because they are using the builtin with jmp_buf variable, jmp_buf is for the libc setjmp, for __builtin_setjmp I think it is just void *buf[5]; or something similar.

BTW, does __builtin_return_address really work on aarch64 without frame pointers?  Various other targets require frame pointers when cfun->machine->access_prev_frame (i.e. when SETUP_FRAME_ADDRESSES () has been used).
Comment 6 Wilco 2018-02-22 19:55:41 UTC
(In reply to Jakub Jelinek from comment #5)

> (completely untested) would require frame pointers for all setjmp calls, not
> just __builtin_setjmp.

That's the correct approach indeed, however aarch64_frame_pointer_required is no longer used, this now needs to added to aarch64_layout_frame:

  /* Force a frame chain for EH returns so the return address is at FP+8.  */
  cfun->machine->frame.emit_frame_chain
    = frame_pointer_needed || crtl->calls_eh_return;

> BTW, does __builtin_return_address really work on aarch64 without frame
> pointers?  Various other targets require frame pointers when
> cfun->machine->access_prev_frame (i.e. when SETUP_FRAME_ADDRESSES () has
> been used).

It only supports returning the return address of the current function (and even that is most likely a bug rather than useful). When level != 0 it always returns 0.
Comment 7 Jakub Jelinek 2018-02-22 20:02:19 UTC
cfun->has_nonlocal_label instead of cfun->calls_setjmp would cover __builtin_setjmp.

aarch64_frame_pointer_required would force frame_pointer_needed and thus be true in that case too.  But sure, if it works, we can change:
   /* Force a frame chain for EH returns so the return address is at FP+8.  */
   cfun->machine->frame.emit_frame_chain
-    = frame_pointer_needed || crtl->calls_eh_return;
+    = frame_pointer_needed || crtl->calls_eh_return || cfun->has_nonlocal_label;
Comment 8 Wilco 2018-02-22 20:22:45 UTC
(In reply to Jakub Jelinek from comment #7)
> cfun->has_nonlocal_label instead of cfun->calls_setjmp would cover
> __builtin_setjmp.
> 
> aarch64_frame_pointer_required would force frame_pointer_needed and thus be
> true in that case too.  But sure, if it works, we can change:
>    /* Force a frame chain for EH returns so the return address is at FP+8. 
> */
>    cfun->machine->frame.emit_frame_chain
> -    = frame_pointer_needed || crtl->calls_eh_return;
> +    = frame_pointer_needed || crtl->calls_eh_return ||
> cfun->has_nonlocal_label;

Note I'm not convinced this is sufficient. I tried compiling testsuite/gcc.c-torture/execute/pr60003.c and it appears to mess up the frame pointer so it no longer points to a frame chain:

baz:
	adrp	x1, .LANCHOR0
	add	x0, x1, :lo12:.LANCHOR0
	stp	x29, x30, [sp, -16]!
	mov	x29, sp
	ldr	x2, [x0, 8]
	ldr	x29, [x1, #:lo12:.LANCHOR0]  // load of bad frame pointer
	ldr	x0, [x0, 16]
	mov	sp, x0
	br	x2

foo:
	stp	x29, x30, [sp, -176]!
	adrp	x2, .LANCHOR0
	add	x1, x2, :lo12:.LANCHOR0
	mov	x29, sp
	add	x3, sp, 176                  // store of bad frame pointer
	str	x3, [x2, #:lo12:.LANCHOR0]
	stp	x19, x20, [sp, 16]
	stp	x21, x22, [sp, 32]
	stp	x23, x24, [sp, 48]
	stp	x25, x26, [sp, 64]
	stp	x27, x28, [sp, 80]
	stp	d8, d9, [sp, 96]
	stp	d10, d11, [sp, 112]
	stp	d12, d13, [sp, 128]
	stp	d14, d15, [sp, 144]
	str	w0, [sp, 172]
	adrp	x0, .L7
	add	x0, x0, :lo12:.L7
	str	x0, [x1, 8]
	mov	x0, sp
	str	x0, [x1, 16]
	bl	baz
	.p2align 2
.L7:
	ldr	w0, [sp, 172]
	ldp	x19, x20, [sp, 16]
	ldp	x21, x22, [sp, 32]
	ldp	x23, x24, [sp, 48]
	ldp	x25, x26, [sp, 64]
	ldp	x27, x28, [sp, 80]
	ldp	d8, d9, [sp, 96]
	ldp	d10, d11, [sp, 112]
	ldp	d12, d13, [sp, 128]
	ldp	d14, d15, [sp, 144]
	ldp	x29, x30, [sp], 176
	ret

What should happen is that it stores the actual sp/fp just before calling baz, and baz then restores those before jumping to L7.
Comment 9 Wilco 2018-02-22 22:18:08 UTC
(In reply to Jakub Jelinek from comment #7)
> cfun->has_nonlocal_label instead of cfun->calls_setjmp would cover
> __builtin_setjmp.

Do non-local labels do the same odd thing? It seems to me if the mid-end automatically inserts explicit writes to the frame pointer, it should also set frame_pointer_needed. This may be a bug on other targets too.

Also a much better implementation would use a small landing pad in the function that does the __builtin_setjmp (rather than inline it a different function), so you avoid the frame pointer corruption. Eg.

baz:
        ...
	ldr	x1, [x0, 8]
	br	x1

L7_nonlocal: (landing pad in foo)
        ldr     x29, [x0, 16]
        ldr     sp,  [x0]
        b       .L7

Or maybe we should get rid of these horrible hacks altogether?
Comment 10 Ramana Radhakrishnan 2018-02-23 00:01:47 UTC
(In reply to Jakub Jelinek from comment #4)
> Is the requirement just for functions that contain setjmp?  If so, the
> backend could just force frame pointers in cfun->calls_setjmp functions.

I think we should flip back fno-omit-frame-pointer on for gcc-8 as that breaks the guarantee that we've had in the port for quite a while. I'm testing a patch currently that I will get out first thing tomorrow to turn this back on.

If we want to turn it off that should be a conscious decision.


> 
> If not, even if the default is tweaked again to be -fno-omit-frame-pointer
> on aarch64, the code is still wrong with explicit -fno-omit-frame-pointer,
> even before that change.

I think we should treat that as a separate but related issue.


Ramana
Comment 11 Wilco 2018-02-23 00:22:51 UTC
(In reply to Ramana Radhakrishnan from comment #10)
> (In reply to Jakub Jelinek from comment #4)
> > Is the requirement just for functions that contain setjmp?  If so, the
> > backend could just force frame pointers in cfun->calls_setjmp functions.
> 
> I think we should flip back fno-omit-frame-pointer on for gcc-8 as that
> breaks the guarantee that we've had in the port for quite a while. I'm
> testing a patch currently that I will get out first thing tomorrow to turn
> this back on.
> 
> If we want to turn it off that should be a conscious decision.
> 
> 
> > 
> > If not, even if the default is tweaked again to be -fno-omit-frame-pointer
> > on aarch64, the code is still wrong with explicit -fno-omit-frame-pointer,
> > even before that change.
> 
> I think we should treat that as a separate but related issue.
> 
> 
> Ramana

The code is clearly incorrect even with the frame pointer is enabled, so this has absolutely nothing to do with the frame pointer default. Like the eh_return builtin, the implementation of these builtins is incorrect with or without a frame pointer (and apparently has always been).
Comment 12 Wilco 2018-02-23 01:07:03 UTC
Note PR64242 is related (also frame pointer corruption by __builtin_longjmp).
Comment 13 Jakub Jelinek 2018-02-23 08:05:35 UTC
Even when you find another PR for __builtin_longjmp (clearly RA related), that doesn't mean that __builtin_{setjmp,longjmp} are totally broken and should not be fixed on aarch64.  As ruby (which for some cryptic reason has been using those for quite some time) shows, it was working properly for years in real-world cases, except on aarch64 with -fomit-frame-pointer.  AFAIK this issue is with __builtin_setjmp or what it is lowered into, not __builtin_longjmp, so isn't really related to PR64242.
__builtin_setjmp/longjmp is described as saving/restoring fewer registers than setjmp/longjmp but with the help of the prologue/epilogue of the frame containing the __builtin_setjmp call.  So if the help on aarch64 needs to be frame pointer, let's add it, if it needs to be something else, cfun->has_nonlocal_labels can be checked elsewhere too.  Don't other non-local gotos suffer a similar problem?
Comment 14 Wilco 2018-02-23 08:17:57 UTC
(In reply to Jakub Jelinek from comment #13)
> Even when you find another PR for __builtin_longjmp (clearly RA related),
> that doesn't mean that __builtin_{setjmp,longjmp} are totally broken and
> should not be fixed on aarch64.  As ruby (which for some cryptic reason has
> been using those for quite some time) shows, it was working properly for
> years in real-world cases, except on aarch64 with -fomit-frame-pointer. 
> AFAIK this issue is with __builtin_setjmp or what it is lowered into, not
> __builtin_longjmp, so isn't really related to PR64242.
> __builtin_setjmp/longjmp is described as saving/restoring fewer registers
> than setjmp/longjmp but with the help of the prologue/epilogue of the frame
> containing the __builtin_setjmp call.  So if the help on aarch64 needs to be
> frame pointer, let's add it, if it needs to be something else,
> cfun->has_nonlocal_labels can be checked elsewhere too.  Don't other
> non-local gotos suffer a similar problem?

The fact that Ruby happen to work without crashes doesn't imply that __builtin_longjmp/setjmp are 100% correct. The existing implementation breaks the AArch64 ABI since it uses the frame pointer as an argument register. Using the framepointer this way (broken frame chain) is explicitly disallowed by the ABI. On x86 it can corrupt its own framepointer, but that's possible on other targets too if you're unlucky.
Comment 15 Ramana Radhakrishnan 2018-02-26 09:25:53 UTC
Author: ramana
Date: Mon Feb 26 09:25:21 2018
New Revision: 257984

URL: https://gcc.gnu.org/viewcvs?rev=257984&root=gcc&view=rev
Log:
[Patch AArch64] Turn on frame pointer / partial fix for PR84521

This fixes a GCC-8 regression that we accidentally switched off frame
pointers in the AArch64 backend when changing the defaults in the common
parts of the code. This breaks an ABI decision that was made in GCC at
the dawn of the port with respect to having a frame pointer at all
times.  If we really want to turn this off lets have a discussion around
that separately.

For now turn this back on and I believe this will leave PR84521 latent
again with -fomit-frame-pointer and (hopefully) make the ruby issue go
away. I'm asking Sudi to pick that up.

Bootstrapped and regression tested on AArch64-none-linux-gnu but I see
one regression in gcc.c-torture/execute/960419-2.c which needs to be
looked at next (PR84528, thanks Kyrill).

Ok to put in and then look at PR84528 ?

2018-02-26  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>

        PR target/84521
	* common/config/aarch64/aarch64-common.c
	(aarch_option_optimization_table[]): Switch
	off fomit-frame-pointer

2018-02-26  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>

	PR target/84521
	* gcc.target/aarch64/lr_free_2.c: Revert changes in
	r254814 disabling -fomit-frame-pointer by default.
	* gcc.target/aarch64/spill_1.c: Likewise.
	* gcc.target/aarch64/test_frame_11.c: Likewise.
	* gcc.target/aarch64/test_frame_12.c: Likewise.
	* gcc.target/aarch64/test_frame_13.c: Likewise.
	* gcc.target/aarch64/test_frame_14.c: Likewise.
	* gcc.target/aarch64/test_frame_15.c: Likewise.
	* gcc.target/aarch64/test_frame_3.c: Likewise.
	* gcc.target/aarch64/test_frame_5.c: Likewise.
	* gcc.target/aarch64/test_frame_9.c: Likewise.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/common/config/aarch64/aarch64-common.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/aarch64/lr_free_2.c
    trunk/gcc/testsuite/gcc.target/aarch64/spill_1.c
    trunk/gcc/testsuite/gcc.target/aarch64/test_frame_11.c
    trunk/gcc/testsuite/gcc.target/aarch64/test_frame_12.c
    trunk/gcc/testsuite/gcc.target/aarch64/test_frame_13.c
    trunk/gcc/testsuite/gcc.target/aarch64/test_frame_14.c
    trunk/gcc/testsuite/gcc.target/aarch64/test_frame_15.c
    trunk/gcc/testsuite/gcc.target/aarch64/test_frame_3.c
    trunk/gcc/testsuite/gcc.target/aarch64/test_frame_5.c
    trunk/gcc/testsuite/gcc.target/aarch64/test_frame_9.c
Comment 16 sudi 2018-03-06 18:28:29 UTC
So I think I would go with Jakub's suggestion of defining calls_builtin_setjmp and use that in aarch64_layout_frame for cfun->machine->frame.emit_frame_chain. I am still investigating Wilco's concern over pr60003.c

Also I would like to ask if we can pick up the discussions on PR59039 and on the documentation patch. Some definition would be a lot more helpful for someone like me who had no idea about __builtin_setjmp/longjmp to wrap my head around what it actually is.

The macro DONT_USE_BUITIN_SETJMP made me very confused for a while until Ramana pointed it out to me that its only used in the context of compiler's exception unwinding machinery and does not apply to a case like this where the user calls the builtin function. With this context in head when I went back and read the document, it made sense. Can we possibly tweak the wording to make it clearer?
Comment 17 sudi 2018-03-07 18:59:57 UTC
I looked up what other targets were doing and one thing found to be interesting was that a lot of them are defining the target hook TARGET_BUILTIN_SETJMP_FRAME_VALUE. In AArch64 case I am suggesting to define it to return the hard frame pointer. That seems to solve the issue with both the attached test case and the test that Wilco mentioned.

Does this look like it solves "mid-end versus back-end : who fixes this issue" problem?

I am still pretty new to knowing how the stack should actually look. So calling for some help!

Sudi
Comment 18 ramana.radhakrishnan 2018-03-08 12:20:39 UTC
On 07/03/2018 18:59, sudi at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84521
> 
> --- Comment #17 from sudi at gcc dot gnu.org ---
> I looked up what other targets were doing and one thing found to be interesting
> was that a lot of them are defining the target hook
> TARGET_BUILTIN_SETJMP_FRAME_VALUE. In AArch64 case I am suggesting to define it
> to return the hard frame pointer. That seems to solve the issue with both the
> attached test case and the test that Wilco mentioned.
> 
> Does this look like it solves "mid-end versus back-end : who fixes this issue"
> problem?
> 
> I am still pretty new to knowing how the stack should actually look. So calling
> for some help!
> 
> Sudi
> 

That looks sensible. Especially see the comment in arc/arc.c - that
seems to mirror the decision we want in AArch64 as well. The logic /
comment around this in the arc port seems quite reasonable to me and
looks like what we can have on AArch64 as well.

Ramana
Comment 19 Wilco 2018-03-08 15:58:26 UTC
(In reply to sudi from comment #17)
> I looked up what other targets were doing and one thing found to be
> interesting was that a lot of them are defining the target hook
> TARGET_BUILTIN_SETJMP_FRAME_VALUE. In AArch64 case I am suggesting to define
> it to return the hard frame pointer. That seems to solve the issue with both
> the attached test case and the test that Wilco mentioned.
> 
> Does this look like it solves "mid-end versus back-end : who fixes this
> issue" problem?
> 
> I am still pretty new to knowing how the stack should actually look. So
> calling for some help!

This is not about stack layout but ensuring we don't corrupt the frame pointer or the stack pointer. Any changes to SP/FP are inherently extremely risky and need to be done exactly right or lots of things will fail.

There are too many issues to make a full list, but here are a few obvious ones:

1. The TARGET_BUILTIN_SETJMP_FRAME_VALUE only fixes the store of the frame pointer, however it does not fix the landing pad, I see code like this:

	str	x29, [x3, #:lo12:.LANCHOR0]
	mov	x0, sp
	stp	x1, x0, [x2, 8]
	bl	baz
	.p2align 2
.L7:
	sub	x29, x29, #176
	ldr	w0, [x29, 172]
	mov	sp, x29

So now the store is correct, but when returning at L2 we corrupt the frame pointer badly (and then the stack pointer as well)... This is basically the same as pr60003.c but with an alloca added.

2. We still need to force the frame pointer to be saved, TARGET_BUILTIN_SETJMP_FRAME_VALUE doesn't fix that.

3. __builtin_longjmp is still incorrect, like I showed in PR64242, it also fails on AArch64:

#include <setjmp.h>
#include <string.h>
void bug (jmp_buf buf)
{
  jmp_buf buf2;
  memcpy (buf2, buf, sizeof (jmp_buf));
  __builtin_longjmp (buf2, 1);
}

bug:
	stp	x29, x30, [sp, -336]!
	mov	x1, x0
	mov	x2, 312
	mov	x29, sp
	add	x0, x29, 24
	bl	memcpy
	ldr	x0, [x29, 32]
	ldr	x29, [x29, 24]  // oops
	ldr	x1, [x29, 40]
	mov	sp, x1
	br	x0

4. Trying to use setjmp and longjmp in the same function often gives this ICE:

during RTL pass: cse1
setjmp.c: In function ‘bug1’:
setjmp.c:86:1: internal compiler error: in mark_jump_label_1, at jump.c:1151
 }
 ^
0xa598f7 mark_jump_label_1
	/home/wdijkstr/gcc/gcc/jump.c:1151
0xa59a6b mark_jump_label_1
	/home/wdijkstr/gcc/gcc/jump.c:1211
0xa59a6b mark_jump_label_1
	/home/wdijkstr/gcc/gcc/jump.c:1211
0xa59e17 mark_all_labels
	/home/wdijkstr/gcc/gcc/jump.c:305
0xa59e17 rebuild_jump_labels_1
	/home/wdijkstr/gcc/gcc/jump.c:74
0x128ce23 rest_of_handle_cse
	/home/wdijkstr/gcc/gcc/cse.c:7628
0x128ce23 execute
	/home/wdijkstr/gcc/gcc/cse.c:7662

5. Things get really interesting on ARM and Thumb-2 where there are multiple frame pointer registers. This shows again why it is wrong to change the sp/fp in the longjmp - for it to work correctly it must be done in the landing pad in the function that has the setjmp.
Comment 20 Jakub Jelinek 2018-03-08 16:14:24 UTC
You can use __builtin_setjmp and __builtin_longjmp in the same function, only if they use a different buffer.  Otherwise it is invalid.  So, while we shouldn't ICE on invalid code, not sure why you are listing 4 here.

CCing Eric for the other __builtin_setjmp related issues.
Comment 21 Eric Botcazou 2018-03-08 16:33:03 UTC
> You can use __builtin_setjmp and __builtin_longjmp in the same function,
> only if they use a different buffer.  Otherwise it is invalid.

Yes, that's invalid.

> CCing Eric for the other __builtin_setjmp related issues.

I don't see anything special with Aarch64 that would make it impossible to support __builtin_setjmp/__builtin_longjmp properly.  You can parameterize a lot of things in the back-end with TARGET_BUILTIN_SETJMP_FRAME_VALUE and various patterns like:
  save_stack_nonlocal
  restore_stack_nonlocal,
  nonlocal_goto_receiver,
  builtin_setjmp_setup,
  builtin_setjmp_receiver,
  builtin_longjmp
See the documentation and existing ports.
Comment 22 Wilco 2018-03-08 22:43:21 UTC
(In reply to Eric Botcazou from comment #21)
> > You can use __builtin_setjmp and __builtin_longjmp in the same function,
> > only if they use a different buffer.  Otherwise it is invalid.
> 
> Yes, that's invalid.

Where exactly is this documented? I can't find no references to __builtin_setjmp or __builtin_longjmp in https://gcc.gnu.org/onlinedocs/gcc-7.3.0/gcc.pdf or https://gcc.gnu.org/onlinedocs/gccint.pdf.

> > CCing Eric for the other __builtin_setjmp related issues.
> 
> I don't see anything special with Aarch64 that would make it impossible to
> support __builtin_setjmp/__builtin_longjmp properly.  You can parameterize a
> lot of things in the back-end with TARGET_BUILTIN_SETJMP_FRAME_VALUE and
> various patterns like:
>   save_stack_nonlocal
>   restore_stack_nonlocal,
>   nonlocal_goto_receiver,
>   builtin_setjmp_setup,
>   builtin_setjmp_receiver,
>   builtin_longjmp
> See the documentation and existing ports.

Which documentation? Like you say, AArch64 doesn't need anything special, it's the mid-end implementation that is broken here. I bet most of my examples fail on other targets too.
Comment 23 Eric Botcazou 2018-03-08 23:27:27 UTC
> Where exactly is this documented? I can't find no references to
> __builtin_setjmp or __builtin_longjmp in
> https://gcc.gnu.org/onlinedocs/gcc-7.3.0/gcc.pdf or
> https://gcc.gnu.org/onlinedocs/gccint.pdf.

__builtin_setjmp/__builtin_longjmp is not documented at all.  You can probably find a discussion in the archives or bugzilla about this.

> Which documentation?

The internal manual, section Machine Description.

> Like you say, AArch64 doesn't need anything special, it's the mid-end 
> implementation that is broken here. I bet most of my examples fail on other
> targets too.

Well, the SJLJ exception handling scheme is piggybacked on this support and has worked for the past 2 decades.  And, until very recently, the Ada compiler was using another exception handling scheme even more directly piggybacked on it and which had worked for the past 2 decades too.
Comment 24 Jakub Jelinek 2018-03-13 09:25:31 UTC
With the change of the default this actually isn't a regression anymore, so not a release blocker.  That doesn't mean a simple fix, e.g. __builtin_setjmp forcing frame pointer even if not otherwise needed, can't be accepted for GCC8 (as an exception), but it is unlikely right time for finding all issues with the builtin and trying to fix all of them.  That can be done in GCC9.
Comment 25 sudi 2018-03-14 17:41:59 UTC
Proposed patch. This obviously does not solve all the issues

https://gcc.gnu.org/ml/gcc-patches/2018-03/msg00668.html
Comment 26 Jakub Jelinek 2018-05-02 10:08:04 UTC
GCC 8.1 has been released.
Comment 27 Jakub Jelinek 2018-07-26 11:21:34 UTC
GCC 8.2 has been released.
Comment 28 sudi 2018-08-01 10:15:05 UTC
Created attachment 44478 [details]
Failing test case

As advised by James on the mailing list, I am adding the test case that is failing on at least AAcrh64 and x86. My proposed patch fixes this on AArch64, but I would like to add this test in the general gcc.c-torture/execute/ folder so that other targets can also check.
Comment 29 Wilco 2019-06-19 12:53:16 UTC
Author: wilco
Date: Wed Jun 19 12:52:43 2019
New Revision: 272473

URL: https://gcc.gnu.org/viewcvs?rev=272473&root=gcc&view=rev
Log:
Simplify setjmp and non-local goto implementation (PR84521)

This fixes and simplifies the setjmp and non-local goto implementation.
Currently the virtual frame pointer is saved when using __builtin_setjmp or
a non-local goto.  Depending on whether a frame pointer is used, this may
either save SP or FP with an immediate offset.  However the goto or longjmp
always updates the hard frame pointer.

A receiver veneer in the original function then assigns the hard frame pointer
to the virtual frame pointer, which should, if it works correctly, again assign
SP or FP.  However the special elimination code in eliminate_regs_in_insn
doesn't do this correctly unless the frame pointer is used, and even if it
worked by writing SP, the frame pointer would still be corrupted.

A much simpler implementation is to always save and restore the hard frame
pointer.  This avoids 2 redundant instructions which add/subtract the virtual
frame offset.  A large amount of code can be removed as a result, including all
implementations of TARGET_BUILTIN_SETJMP_FRAME_VALUE (all of which already use
the hard frame pointer).  The expansion of nonlocal_goto on PA can be simplied
to just restore the hard frame pointer. 

This fixes the most obvious issues, however there are still issues on targets
which define HARD_FRAME_POINTER_IS_FRAME_POINTER (arm, mips).
Each function could have a different hard frame pointer, so a non-local goto
may restore the wrong frame pointer (TARGET_BUILTIN_SETJMP_FRAME_VALUE could
be useful for this).

The i386 TARGET_BUILTIN_SETJMP_FRAME_VALUE was incorrect: if stack_realign_fp
is true, it would save the hard frame pointer value but restore the virtual
frame pointer which according to ix86_initial_elimination_offset can have a
non-zero offset from the hard frame pointer.

The ia64 implementation of nonlocal_goto seems incorrect since the helper
function moves the the frame pointer value into the static chain register
(so this patch does nothing to make it better or worse).

AArch64 + x86-64 bootstrap OK, new test passes on AArch64, x86-64 and Arm.

gcc/
	PR middle-end/84521
	* builtins.c (expand_builtin_setjmp_setup): Save
	hard_frame_pointer_rtx.
	(expand_builtin_setjmp_receiver): Do not emit sfp = fp move since we
	restore fp.
	* function.c (expand_function_start): Save hard_frame_pointer_rtx for
	non-local goto.
	* lra-eliminations.c (eliminate_regs_in_insn): Remove sfp = fp
	elimination code.
	(remove_reg_equal_offset_note): Remove unused function.
	* reload1.c (eliminate_regs_in_insn): Remove sfp = hfp elimination
	code.
	* config/arc/arc.c (TARGET_BUILTIN_SETJMP_FRAME_VALUE): Remove.
	(arc_builtin_setjmp_frame_value): Remove function.
	* config/avr/avr.c  (TARGET_BUILTIN_SETJMP_FRAME_VALUE): Remove.
	(avr_builtin_setjmp_frame_value): Remove function.
	* config/i386/i386.c (TARGET_BUILTIN_SETJMP_FRAME_VALUE): Remove.
	(ix86_builtin_setjmp_frame_value): Remove function.
	* config/pa/pa.md (nonlocal_goto): Remove FP adjustment.
	* config/sparc/sparc.c (TARGET_BUILTIN_SETJMP_FRAME_VALUE): Remove.
	(sparc_builtin_setjmp_frame_value): Remove function.
	* config/vax/vax.c (TARGET_BUILTIN_SETJMP_FRAME_VALUE): Remove.
	(vax_builtin_setjmp_frame_value): Remove function.
	* config/xtensa/xtensa.c (xtensa_frame_pointer_required): Force frame
	pointer	if has_nonlocal_label.

testsuite/
	PR middle-end/84521
	* gcc.c-torture/execute/pr84521.c: New test.

Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/pr84521.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/builtins.c
    trunk/gcc/config/arc/arc.c
    trunk/gcc/config/avr/avr.c
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/pa/pa.md
    trunk/gcc/config/sparc/sparc.c
    trunk/gcc/config/vax/vax.c
    trunk/gcc/config/xtensa/xtensa.c
    trunk/gcc/function.c
    trunk/gcc/lra-eliminations.c
    trunk/gcc/reload1.c
    trunk/gcc/testsuite/ChangeLog