The downstream bug report here: https://bugzilla.redhat.com/show_bug.cgi?id=1545239 describes a problem seen on aarch64 with gcc 8 that breaks the build of Ruby at -O1 and above. Bisection shows that the problem started with r254815, which made -fomit-frame-pointer the default. Jakub reported: > It is actually much older, I get the same crash if vm.c is compiled with > -mlittle-endian -mabi=lp64 -g -grecord-gcc-switches -O1 -Wall -Werror=format-security > -fexceptions -fPIC -fstack-protector -fno-strict-overflow -fexcess-precision=standard -fomit-frame-pointer > with r204770, so already GCC 4.9 behaves that way too. > Note ruby uses -fno-omit-frame-pointer already, but only on mingw32. The issue is that the code generated for __builtin_longjmp reads a value for x29 (the frame pointer) from the jmp_buf, but the code generated for __builtin_setjmp doesn't actually write x29 to the jmp_buf, leading to corruption of x29 when a longjmp occurs. This corruption seems to be short-lived when -fno-omit-frame-pointer (the old default), as every function restores x29 from the stack on exit. With the new default of -fomit-frame-pointer the corruption can survive long enough to cause crashes. There's a lot more analysis at the downstream bug report in the URL above. I'm about to attach a reproducer. I'm marking this as "[8 Regression]" since although this is appears to be a long-standing bug, the change of default in r254815 exposes it by default.
Created attachment 43489 [details] Reproducer When compiled with: gcc -DDUMP -g -O0 -fstack-protector-strong -Wall test.c this runs to completion, and the x29 values show the function calls/returns: x29 = 0x7ff2977910 : main : start of main x29 = 0x7ff29778d0 : test_2 : start of test_2 x29 = 0x7ff29776a0 : test_1 : start of test_1 x29 = 0x7ff29776a0 : test_1 : zero return x29 = 0x7ff2977690 : uses_longjmp : in uses_longjmp x29 = 0x7ff2977740 : test_1 : non-zero return x29 = 0x7ff2977690 : after_longjmp : after raise x29 = 0x7ff2977740 : test_1 : end of test_1 x29 = 0x7ff29778d0 : test_2 : end of test_2 x29 = 0x7ff2977910 : main : end of main On adding -fomit-frame-pointer, it crashes, and the x29 values show a corruption after "uses_longjmp" which becomes a crash when the x29 value is later used: x29 = 0x7fff4709c0 : main : start of main x29 = 0x7fff470960 : test_2 : start of test_2 x29 = 0x7fff470960 : test_1 : start of test_1 x29 = 0x7fff470960 : test_1 : zero return x29 = 0x7fff470720 : uses_longjmp : in uses_longjmp x29 = 0x7fff4707d0 : test_1 : non-zero return x29 = 0x7fff4707d0 : after_longjmp : after raise x29 = 0x7fff4707d0 : test_1 : end of test_1 x29 = 0x7fff4707d0 : test_2 : end of test_2 *** stack smashing detected ***: ./test-O0-omit-fp terminated
It is a bug that we have changed to -fomit-frame-pointer by default for AArch64. That changes a long standing ABI decision made at the dawn of the port, and promised as a feature of the architecture. I would like to see this fixed for GCC 8. Ramana was testing a patch to fix this and change us back to -fno-omit-frame-pointer, it (or someone else's patch achieving the same) would be appreciated as the immediate fix for this issue. I haven't validated the longer-term problem you mention with -fomit-frame-pointer. Ramana, can you pick this up and set us back to the appropriate default? Otherwise, I can spin a patch. We should fix this urgently, or we miss the good value that comes from whole-distribution testing.
To me any use of __builtin_setjmp/__builtin_longjmp is almost always incorrect.
Is the requirement just for functions that contain setjmp? If so, the backend could just force frame pointers in cfun->calls_setjmp functions. If not, even if the default is tweaked again to be -fno-omit-frame-pointer on aarch64, the code is still wrong with explicit -fno-omit-frame-pointer, even before that change. The test uses __builtin_setjmp, can't reproduce it when using normal setjmp, so is it just __builtin_setjmp that requires frame pointers? I think we don't really have a flag about uses of __builtin_setjmp right now, but it could be added next to calls_setjmp (calls_builtin_setjmp). Marking the function with __builtin_setjmp with __attribute__((optimize ("no-omit-frame-pointer"))) fixes it too.
--- gcc/config/aarch64/aarch64.c.jj 2018-02-22 09:26:12.028616476 +0100 +++ gcc/config/aarch64/aarch64.c 2018-02-22 20:23:29.449621557 +0100 @@ -7432,6 +7432,20 @@ aarch64_secondary_reload (bool in_p ATTR return NO_REGS; } +/* Value should be nonzero if functions must have frame pointers. + Zero means the frame pointer need not be set up (and parms may + be accessed via the stack pointer) in functions that seem suitable. */ + +static bool +aarch64_frame_pointer_required (void) +{ + /* __builtin_setjmp requries frame pointers. */ + if (cfun->calls_setjmp) + return true; + + return false; +} + static bool aarch64_can_eliminate (const int from ATTRIBUTE_UNUSED, const int to) { @@ -17463,6 +17477,9 @@ aarch64_run_selftests (void) #undef TARGET_CALLEE_COPIES #define TARGET_CALLEE_COPIES hook_bool_CUMULATIVE_ARGS_mode_tree_bool_false +#undef TARGET_FRAME_POINTER_REQUIRED +#define TARGET_FRAME_POINTER_REQUIRED aarch64_frame_pointer_required + #undef TARGET_CAN_ELIMINATE #define TARGET_CAN_ELIMINATE aarch64_can_eliminate (completely untested) would require frame pointers for all setjmp calls, not just __builtin_setjmp. I agree pretty much all uses of __builtin_setjmp are a bug, but somebody needs to explain it to the ruby authors. It is even weirder because they are using the builtin with jmp_buf variable, jmp_buf is for the libc setjmp, for __builtin_setjmp I think it is just void *buf[5]; or something similar. BTW, does __builtin_return_address really work on aarch64 without frame pointers? Various other targets require frame pointers when cfun->machine->access_prev_frame (i.e. when SETUP_FRAME_ADDRESSES () has been used).
(In reply to Jakub Jelinek from comment #5) > (completely untested) would require frame pointers for all setjmp calls, not > just __builtin_setjmp. That's the correct approach indeed, however aarch64_frame_pointer_required is no longer used, this now needs to added to aarch64_layout_frame: /* Force a frame chain for EH returns so the return address is at FP+8. */ cfun->machine->frame.emit_frame_chain = frame_pointer_needed || crtl->calls_eh_return; > BTW, does __builtin_return_address really work on aarch64 without frame > pointers? Various other targets require frame pointers when > cfun->machine->access_prev_frame (i.e. when SETUP_FRAME_ADDRESSES () has > been used). It only supports returning the return address of the current function (and even that is most likely a bug rather than useful). When level != 0 it always returns 0.
cfun->has_nonlocal_label instead of cfun->calls_setjmp would cover __builtin_setjmp. aarch64_frame_pointer_required would force frame_pointer_needed and thus be true in that case too. But sure, if it works, we can change: /* Force a frame chain for EH returns so the return address is at FP+8. */ cfun->machine->frame.emit_frame_chain - = frame_pointer_needed || crtl->calls_eh_return; + = frame_pointer_needed || crtl->calls_eh_return || cfun->has_nonlocal_label;
(In reply to Jakub Jelinek from comment #7) > cfun->has_nonlocal_label instead of cfun->calls_setjmp would cover > __builtin_setjmp. > > aarch64_frame_pointer_required would force frame_pointer_needed and thus be > true in that case too. But sure, if it works, we can change: > /* Force a frame chain for EH returns so the return address is at FP+8. > */ > cfun->machine->frame.emit_frame_chain > - = frame_pointer_needed || crtl->calls_eh_return; > + = frame_pointer_needed || crtl->calls_eh_return || > cfun->has_nonlocal_label; Note I'm not convinced this is sufficient. I tried compiling testsuite/gcc.c-torture/execute/pr60003.c and it appears to mess up the frame pointer so it no longer points to a frame chain: baz: adrp x1, .LANCHOR0 add x0, x1, :lo12:.LANCHOR0 stp x29, x30, [sp, -16]! mov x29, sp ldr x2, [x0, 8] ldr x29, [x1, #:lo12:.LANCHOR0] // load of bad frame pointer ldr x0, [x0, 16] mov sp, x0 br x2 foo: stp x29, x30, [sp, -176]! adrp x2, .LANCHOR0 add x1, x2, :lo12:.LANCHOR0 mov x29, sp add x3, sp, 176 // store of bad frame pointer str x3, [x2, #:lo12:.LANCHOR0] stp x19, x20, [sp, 16] stp x21, x22, [sp, 32] stp x23, x24, [sp, 48] stp x25, x26, [sp, 64] stp x27, x28, [sp, 80] stp d8, d9, [sp, 96] stp d10, d11, [sp, 112] stp d12, d13, [sp, 128] stp d14, d15, [sp, 144] str w0, [sp, 172] adrp x0, .L7 add x0, x0, :lo12:.L7 str x0, [x1, 8] mov x0, sp str x0, [x1, 16] bl baz .p2align 2 .L7: ldr w0, [sp, 172] ldp x19, x20, [sp, 16] ldp x21, x22, [sp, 32] ldp x23, x24, [sp, 48] ldp x25, x26, [sp, 64] ldp x27, x28, [sp, 80] ldp d8, d9, [sp, 96] ldp d10, d11, [sp, 112] ldp d12, d13, [sp, 128] ldp d14, d15, [sp, 144] ldp x29, x30, [sp], 176 ret What should happen is that it stores the actual sp/fp just before calling baz, and baz then restores those before jumping to L7.
(In reply to Jakub Jelinek from comment #7) > cfun->has_nonlocal_label instead of cfun->calls_setjmp would cover > __builtin_setjmp. Do non-local labels do the same odd thing? It seems to me if the mid-end automatically inserts explicit writes to the frame pointer, it should also set frame_pointer_needed. This may be a bug on other targets too. Also a much better implementation would use a small landing pad in the function that does the __builtin_setjmp (rather than inline it a different function), so you avoid the frame pointer corruption. Eg. baz: ... ldr x1, [x0, 8] br x1 L7_nonlocal: (landing pad in foo) ldr x29, [x0, 16] ldr sp, [x0] b .L7 Or maybe we should get rid of these horrible hacks altogether?
(In reply to Jakub Jelinek from comment #4) > Is the requirement just for functions that contain setjmp? If so, the > backend could just force frame pointers in cfun->calls_setjmp functions. I think we should flip back fno-omit-frame-pointer on for gcc-8 as that breaks the guarantee that we've had in the port for quite a while. I'm testing a patch currently that I will get out first thing tomorrow to turn this back on. If we want to turn it off that should be a conscious decision. > > If not, even if the default is tweaked again to be -fno-omit-frame-pointer > on aarch64, the code is still wrong with explicit -fno-omit-frame-pointer, > even before that change. I think we should treat that as a separate but related issue. Ramana
(In reply to Ramana Radhakrishnan from comment #10) > (In reply to Jakub Jelinek from comment #4) > > Is the requirement just for functions that contain setjmp? If so, the > > backend could just force frame pointers in cfun->calls_setjmp functions. > > I think we should flip back fno-omit-frame-pointer on for gcc-8 as that > breaks the guarantee that we've had in the port for quite a while. I'm > testing a patch currently that I will get out first thing tomorrow to turn > this back on. > > If we want to turn it off that should be a conscious decision. > > > > > > If not, even if the default is tweaked again to be -fno-omit-frame-pointer > > on aarch64, the code is still wrong with explicit -fno-omit-frame-pointer, > > even before that change. > > I think we should treat that as a separate but related issue. > > > Ramana The code is clearly incorrect even with the frame pointer is enabled, so this has absolutely nothing to do with the frame pointer default. Like the eh_return builtin, the implementation of these builtins is incorrect with or without a frame pointer (and apparently has always been).
Note PR64242 is related (also frame pointer corruption by __builtin_longjmp).
Even when you find another PR for __builtin_longjmp (clearly RA related), that doesn't mean that __builtin_{setjmp,longjmp} are totally broken and should not be fixed on aarch64. As ruby (which for some cryptic reason has been using those for quite some time) shows, it was working properly for years in real-world cases, except on aarch64 with -fomit-frame-pointer. AFAIK this issue is with __builtin_setjmp or what it is lowered into, not __builtin_longjmp, so isn't really related to PR64242. __builtin_setjmp/longjmp is described as saving/restoring fewer registers than setjmp/longjmp but with the help of the prologue/epilogue of the frame containing the __builtin_setjmp call. So if the help on aarch64 needs to be frame pointer, let's add it, if it needs to be something else, cfun->has_nonlocal_labels can be checked elsewhere too. Don't other non-local gotos suffer a similar problem?
(In reply to Jakub Jelinek from comment #13) > Even when you find another PR for __builtin_longjmp (clearly RA related), > that doesn't mean that __builtin_{setjmp,longjmp} are totally broken and > should not be fixed on aarch64. As ruby (which for some cryptic reason has > been using those for quite some time) shows, it was working properly for > years in real-world cases, except on aarch64 with -fomit-frame-pointer. > AFAIK this issue is with __builtin_setjmp or what it is lowered into, not > __builtin_longjmp, so isn't really related to PR64242. > __builtin_setjmp/longjmp is described as saving/restoring fewer registers > than setjmp/longjmp but with the help of the prologue/epilogue of the frame > containing the __builtin_setjmp call. So if the help on aarch64 needs to be > frame pointer, let's add it, if it needs to be something else, > cfun->has_nonlocal_labels can be checked elsewhere too. Don't other > non-local gotos suffer a similar problem? The fact that Ruby happen to work without crashes doesn't imply that __builtin_longjmp/setjmp are 100% correct. The existing implementation breaks the AArch64 ABI since it uses the frame pointer as an argument register. Using the framepointer this way (broken frame chain) is explicitly disallowed by the ABI. On x86 it can corrupt its own framepointer, but that's possible on other targets too if you're unlucky.
Author: ramana Date: Mon Feb 26 09:25:21 2018 New Revision: 257984 URL: https://gcc.gnu.org/viewcvs?rev=257984&root=gcc&view=rev Log: [Patch AArch64] Turn on frame pointer / partial fix for PR84521 This fixes a GCC-8 regression that we accidentally switched off frame pointers in the AArch64 backend when changing the defaults in the common parts of the code. This breaks an ABI decision that was made in GCC at the dawn of the port with respect to having a frame pointer at all times. If we really want to turn this off lets have a discussion around that separately. For now turn this back on and I believe this will leave PR84521 latent again with -fomit-frame-pointer and (hopefully) make the ruby issue go away. I'm asking Sudi to pick that up. Bootstrapped and regression tested on AArch64-none-linux-gnu but I see one regression in gcc.c-torture/execute/960419-2.c which needs to be looked at next (PR84528, thanks Kyrill). Ok to put in and then look at PR84528 ? 2018-02-26 Ramana Radhakrishnan <ramana.radhakrishnan@arm.com> PR target/84521 * common/config/aarch64/aarch64-common.c (aarch_option_optimization_table[]): Switch off fomit-frame-pointer 2018-02-26 Ramana Radhakrishnan <ramana.radhakrishnan@arm.com> PR target/84521 * gcc.target/aarch64/lr_free_2.c: Revert changes in r254814 disabling -fomit-frame-pointer by default. * gcc.target/aarch64/spill_1.c: Likewise. * gcc.target/aarch64/test_frame_11.c: Likewise. * gcc.target/aarch64/test_frame_12.c: Likewise. * gcc.target/aarch64/test_frame_13.c: Likewise. * gcc.target/aarch64/test_frame_14.c: Likewise. * gcc.target/aarch64/test_frame_15.c: Likewise. * gcc.target/aarch64/test_frame_3.c: Likewise. * gcc.target/aarch64/test_frame_5.c: Likewise. * gcc.target/aarch64/test_frame_9.c: Likewise. Modified: trunk/gcc/ChangeLog trunk/gcc/common/config/aarch64/aarch64-common.c trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/aarch64/lr_free_2.c trunk/gcc/testsuite/gcc.target/aarch64/spill_1.c trunk/gcc/testsuite/gcc.target/aarch64/test_frame_11.c trunk/gcc/testsuite/gcc.target/aarch64/test_frame_12.c trunk/gcc/testsuite/gcc.target/aarch64/test_frame_13.c trunk/gcc/testsuite/gcc.target/aarch64/test_frame_14.c trunk/gcc/testsuite/gcc.target/aarch64/test_frame_15.c trunk/gcc/testsuite/gcc.target/aarch64/test_frame_3.c trunk/gcc/testsuite/gcc.target/aarch64/test_frame_5.c trunk/gcc/testsuite/gcc.target/aarch64/test_frame_9.c
So I think I would go with Jakub's suggestion of defining calls_builtin_setjmp and use that in aarch64_layout_frame for cfun->machine->frame.emit_frame_chain. I am still investigating Wilco's concern over pr60003.c Also I would like to ask if we can pick up the discussions on PR59039 and on the documentation patch. Some definition would be a lot more helpful for someone like me who had no idea about __builtin_setjmp/longjmp to wrap my head around what it actually is. The macro DONT_USE_BUITIN_SETJMP made me very confused for a while until Ramana pointed it out to me that its only used in the context of compiler's exception unwinding machinery and does not apply to a case like this where the user calls the builtin function. With this context in head when I went back and read the document, it made sense. Can we possibly tweak the wording to make it clearer?
I looked up what other targets were doing and one thing found to be interesting was that a lot of them are defining the target hook TARGET_BUILTIN_SETJMP_FRAME_VALUE. In AArch64 case I am suggesting to define it to return the hard frame pointer. That seems to solve the issue with both the attached test case and the test that Wilco mentioned. Does this look like it solves "mid-end versus back-end : who fixes this issue" problem? I am still pretty new to knowing how the stack should actually look. So calling for some help! Sudi
On 07/03/2018 18:59, sudi at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84521 > > --- Comment #17 from sudi at gcc dot gnu.org --- > I looked up what other targets were doing and one thing found to be interesting > was that a lot of them are defining the target hook > TARGET_BUILTIN_SETJMP_FRAME_VALUE. In AArch64 case I am suggesting to define it > to return the hard frame pointer. That seems to solve the issue with both the > attached test case and the test that Wilco mentioned. > > Does this look like it solves "mid-end versus back-end : who fixes this issue" > problem? > > I am still pretty new to knowing how the stack should actually look. So calling > for some help! > > Sudi > That looks sensible. Especially see the comment in arc/arc.c - that seems to mirror the decision we want in AArch64 as well. The logic / comment around this in the arc port seems quite reasonable to me and looks like what we can have on AArch64 as well. Ramana
(In reply to sudi from comment #17) > I looked up what other targets were doing and one thing found to be > interesting was that a lot of them are defining the target hook > TARGET_BUILTIN_SETJMP_FRAME_VALUE. In AArch64 case I am suggesting to define > it to return the hard frame pointer. That seems to solve the issue with both > the attached test case and the test that Wilco mentioned. > > Does this look like it solves "mid-end versus back-end : who fixes this > issue" problem? > > I am still pretty new to knowing how the stack should actually look. So > calling for some help! This is not about stack layout but ensuring we don't corrupt the frame pointer or the stack pointer. Any changes to SP/FP are inherently extremely risky and need to be done exactly right or lots of things will fail. There are too many issues to make a full list, but here are a few obvious ones: 1. The TARGET_BUILTIN_SETJMP_FRAME_VALUE only fixes the store of the frame pointer, however it does not fix the landing pad, I see code like this: str x29, [x3, #:lo12:.LANCHOR0] mov x0, sp stp x1, x0, [x2, 8] bl baz .p2align 2 .L7: sub x29, x29, #176 ldr w0, [x29, 172] mov sp, x29 So now the store is correct, but when returning at L2 we corrupt the frame pointer badly (and then the stack pointer as well)... This is basically the same as pr60003.c but with an alloca added. 2. We still need to force the frame pointer to be saved, TARGET_BUILTIN_SETJMP_FRAME_VALUE doesn't fix that. 3. __builtin_longjmp is still incorrect, like I showed in PR64242, it also fails on AArch64: #include <setjmp.h> #include <string.h> void bug (jmp_buf buf) { jmp_buf buf2; memcpy (buf2, buf, sizeof (jmp_buf)); __builtin_longjmp (buf2, 1); } bug: stp x29, x30, [sp, -336]! mov x1, x0 mov x2, 312 mov x29, sp add x0, x29, 24 bl memcpy ldr x0, [x29, 32] ldr x29, [x29, 24] // oops ldr x1, [x29, 40] mov sp, x1 br x0 4. Trying to use setjmp and longjmp in the same function often gives this ICE: during RTL pass: cse1 setjmp.c: In function ‘bug1’: setjmp.c:86:1: internal compiler error: in mark_jump_label_1, at jump.c:1151 } ^ 0xa598f7 mark_jump_label_1 /home/wdijkstr/gcc/gcc/jump.c:1151 0xa59a6b mark_jump_label_1 /home/wdijkstr/gcc/gcc/jump.c:1211 0xa59a6b mark_jump_label_1 /home/wdijkstr/gcc/gcc/jump.c:1211 0xa59e17 mark_all_labels /home/wdijkstr/gcc/gcc/jump.c:305 0xa59e17 rebuild_jump_labels_1 /home/wdijkstr/gcc/gcc/jump.c:74 0x128ce23 rest_of_handle_cse /home/wdijkstr/gcc/gcc/cse.c:7628 0x128ce23 execute /home/wdijkstr/gcc/gcc/cse.c:7662 5. Things get really interesting on ARM and Thumb-2 where there are multiple frame pointer registers. This shows again why it is wrong to change the sp/fp in the longjmp - for it to work correctly it must be done in the landing pad in the function that has the setjmp.
You can use __builtin_setjmp and __builtin_longjmp in the same function, only if they use a different buffer. Otherwise it is invalid. So, while we shouldn't ICE on invalid code, not sure why you are listing 4 here. CCing Eric for the other __builtin_setjmp related issues.
> You can use __builtin_setjmp and __builtin_longjmp in the same function, > only if they use a different buffer. Otherwise it is invalid. Yes, that's invalid. > CCing Eric for the other __builtin_setjmp related issues. I don't see anything special with Aarch64 that would make it impossible to support __builtin_setjmp/__builtin_longjmp properly. You can parameterize a lot of things in the back-end with TARGET_BUILTIN_SETJMP_FRAME_VALUE and various patterns like: save_stack_nonlocal restore_stack_nonlocal, nonlocal_goto_receiver, builtin_setjmp_setup, builtin_setjmp_receiver, builtin_longjmp See the documentation and existing ports.
(In reply to Eric Botcazou from comment #21) > > You can use __builtin_setjmp and __builtin_longjmp in the same function, > > only if they use a different buffer. Otherwise it is invalid. > > Yes, that's invalid. Where exactly is this documented? I can't find no references to __builtin_setjmp or __builtin_longjmp in https://gcc.gnu.org/onlinedocs/gcc-7.3.0/gcc.pdf or https://gcc.gnu.org/onlinedocs/gccint.pdf. > > CCing Eric for the other __builtin_setjmp related issues. > > I don't see anything special with Aarch64 that would make it impossible to > support __builtin_setjmp/__builtin_longjmp properly. You can parameterize a > lot of things in the back-end with TARGET_BUILTIN_SETJMP_FRAME_VALUE and > various patterns like: > save_stack_nonlocal > restore_stack_nonlocal, > nonlocal_goto_receiver, > builtin_setjmp_setup, > builtin_setjmp_receiver, > builtin_longjmp > See the documentation and existing ports. Which documentation? Like you say, AArch64 doesn't need anything special, it's the mid-end implementation that is broken here. I bet most of my examples fail on other targets too.
> Where exactly is this documented? I can't find no references to > __builtin_setjmp or __builtin_longjmp in > https://gcc.gnu.org/onlinedocs/gcc-7.3.0/gcc.pdf or > https://gcc.gnu.org/onlinedocs/gccint.pdf. __builtin_setjmp/__builtin_longjmp is not documented at all. You can probably find a discussion in the archives or bugzilla about this. > Which documentation? The internal manual, section Machine Description. > Like you say, AArch64 doesn't need anything special, it's the mid-end > implementation that is broken here. I bet most of my examples fail on other > targets too. Well, the SJLJ exception handling scheme is piggybacked on this support and has worked for the past 2 decades. And, until very recently, the Ada compiler was using another exception handling scheme even more directly piggybacked on it and which had worked for the past 2 decades too.
With the change of the default this actually isn't a regression anymore, so not a release blocker. That doesn't mean a simple fix, e.g. __builtin_setjmp forcing frame pointer even if not otherwise needed, can't be accepted for GCC8 (as an exception), but it is unlikely right time for finding all issues with the builtin and trying to fix all of them. That can be done in GCC9.
Proposed patch. This obviously does not solve all the issues https://gcc.gnu.org/ml/gcc-patches/2018-03/msg00668.html
GCC 8.1 has been released.
GCC 8.2 has been released.
Created attachment 44478 [details] Failing test case As advised by James on the mailing list, I am adding the test case that is failing on at least AAcrh64 and x86. My proposed patch fixes this on AArch64, but I would like to add this test in the general gcc.c-torture/execute/ folder so that other targets can also check.
Author: wilco Date: Wed Jun 19 12:52:43 2019 New Revision: 272473 URL: https://gcc.gnu.org/viewcvs?rev=272473&root=gcc&view=rev Log: Simplify setjmp and non-local goto implementation (PR84521) This fixes and simplifies the setjmp and non-local goto implementation. Currently the virtual frame pointer is saved when using __builtin_setjmp or a non-local goto. Depending on whether a frame pointer is used, this may either save SP or FP with an immediate offset. However the goto or longjmp always updates the hard frame pointer. A receiver veneer in the original function then assigns the hard frame pointer to the virtual frame pointer, which should, if it works correctly, again assign SP or FP. However the special elimination code in eliminate_regs_in_insn doesn't do this correctly unless the frame pointer is used, and even if it worked by writing SP, the frame pointer would still be corrupted. A much simpler implementation is to always save and restore the hard frame pointer. This avoids 2 redundant instructions which add/subtract the virtual frame offset. A large amount of code can be removed as a result, including all implementations of TARGET_BUILTIN_SETJMP_FRAME_VALUE (all of which already use the hard frame pointer). The expansion of nonlocal_goto on PA can be simplied to just restore the hard frame pointer. This fixes the most obvious issues, however there are still issues on targets which define HARD_FRAME_POINTER_IS_FRAME_POINTER (arm, mips). Each function could have a different hard frame pointer, so a non-local goto may restore the wrong frame pointer (TARGET_BUILTIN_SETJMP_FRAME_VALUE could be useful for this). The i386 TARGET_BUILTIN_SETJMP_FRAME_VALUE was incorrect: if stack_realign_fp is true, it would save the hard frame pointer value but restore the virtual frame pointer which according to ix86_initial_elimination_offset can have a non-zero offset from the hard frame pointer. The ia64 implementation of nonlocal_goto seems incorrect since the helper function moves the the frame pointer value into the static chain register (so this patch does nothing to make it better or worse). AArch64 + x86-64 bootstrap OK, new test passes on AArch64, x86-64 and Arm. gcc/ PR middle-end/84521 * builtins.c (expand_builtin_setjmp_setup): Save hard_frame_pointer_rtx. (expand_builtin_setjmp_receiver): Do not emit sfp = fp move since we restore fp. * function.c (expand_function_start): Save hard_frame_pointer_rtx for non-local goto. * lra-eliminations.c (eliminate_regs_in_insn): Remove sfp = fp elimination code. (remove_reg_equal_offset_note): Remove unused function. * reload1.c (eliminate_regs_in_insn): Remove sfp = hfp elimination code. * config/arc/arc.c (TARGET_BUILTIN_SETJMP_FRAME_VALUE): Remove. (arc_builtin_setjmp_frame_value): Remove function. * config/avr/avr.c (TARGET_BUILTIN_SETJMP_FRAME_VALUE): Remove. (avr_builtin_setjmp_frame_value): Remove function. * config/i386/i386.c (TARGET_BUILTIN_SETJMP_FRAME_VALUE): Remove. (ix86_builtin_setjmp_frame_value): Remove function. * config/pa/pa.md (nonlocal_goto): Remove FP adjustment. * config/sparc/sparc.c (TARGET_BUILTIN_SETJMP_FRAME_VALUE): Remove. (sparc_builtin_setjmp_frame_value): Remove function. * config/vax/vax.c (TARGET_BUILTIN_SETJMP_FRAME_VALUE): Remove. (vax_builtin_setjmp_frame_value): Remove function. * config/xtensa/xtensa.c (xtensa_frame_pointer_required): Force frame pointer if has_nonlocal_label. testsuite/ PR middle-end/84521 * gcc.c-torture/execute/pr84521.c: New test. Added: trunk/gcc/testsuite/gcc.c-torture/execute/pr84521.c Modified: trunk/gcc/ChangeLog trunk/gcc/builtins.c trunk/gcc/config/arc/arc.c trunk/gcc/config/avr/avr.c trunk/gcc/config/i386/i386.c trunk/gcc/config/pa/pa.md trunk/gcc/config/sparc/sparc.c trunk/gcc/config/vax/vax.c trunk/gcc/config/xtensa/xtensa.c trunk/gcc/function.c trunk/gcc/lra-eliminations.c trunk/gcc/reload1.c trunk/gcc/testsuite/ChangeLog