cilk_fiber-unix.cpp has

#pragma GCC push_options
#pragma GCC optimize ("-O0")
NORETURN cilk_fiber_sysdep::run()
{

It fails when compiled with any optimization. This function calls both __builtin_setjmp and __builtin_longjmp. When __builtin_longjmp is moved into a separate function, optimization works. But there is no documentation on how __builtin_setjmp and __builtin_longjmp should be used: the jump buffer size, the meaning of its fields, or any limitations. If it is true that __builtin_setjmp and __builtin_longjmp can't be used within the same function, GCC should issue an error, or at least a warning.
__builtin_longjmp/setjmp are just longjmp(3) setjmp(3) with their constraints. They should not be used directly but <setjmp.h> should be. The file looks scary.
I suppose

  // alloca() to force generation of frame pointer.  The argument to alloca
  // is contrived to prevent the compiler from optimizing it away.  This
  // code should never actually be executed.
  int* dummy = (int*) alloca((sizeof(int) + (std::size_t) m_start_proc) & 0x1);
  *dummy = 0xface;

is optimized away. Try using volatile int *.
(In reply to Richard Biener from comment #1)
> __builtin_longjmp/setjmp are just longjmp(3) setjmp(3) with their
> constraints.
> They should not be used directly but <setjmp.h> should be.

That was my first impression. But I saw

[hjl@gnu-hsw-1 tmp]$ cat y.c
#include <setjmp.h>

extern jmp_buf buf;

void
foo ()
{
  __builtin_setjmp (buf);
}

void
bar ()
{
  __builtin_longjmp (buf, 1);
}
[hjl@gnu-hsw-1 tmp]$ gcc -S -O2 y.c
[hjl@gnu-hsw-1 tmp]$ cat y.s
        .file   "y.c"
        .text
        .p2align 4,,15
        .globl  foo
        .type   foo, @function
foo:
.LFB0:
        .cfi_startproc
        movq    %rsp, buf(%rip)
        movq    $.L2, buf+8(%rip)
        movq    %rsp, buf+16(%rip)
        ret
.L2:
.L4:
        .cfi_endproc
.LFE0:
        .size   foo, .-foo
        .p2align 4,,15
        .globl  bar
        .type   bar, @function
bar:
.LFB1:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    buf+8(%rip), %rax
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movq    buf(%rip), %rbp
        movq    buf+16(%rip), %rsp
        jmp     *%rax
        .cfi_endproc
.LFE1:
        .size   bar, .-bar

__builtin_longjmp/setjmp only save/restore BP, SP and PC on x86, x32 and x86-64.
Ah, those are the builtins used for SJLJ and they get lowered to setjmp_setup,dispatcher and longjmp. Don't use __builtin_setjmp as I said, use setjmp. I suppose these shouldn't have been made callable by user code? Eric?
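For comparison, a minimal sketch of the <setjmp.h> route recommended here. The function name retry_count and the buffer name env are made up for illustration. With the C library functions, calling longjmp from the same function that called setjmp is allowed, provided that function has not returned; the local that is modified between the two calls is declared volatile so that its value is well defined after the jump.

```c
#include <setjmp.h>

/* Jump buffer of the C library type, not the five-word builtin buffer. */
static jmp_buf env;

/* Hypothetical helper: jumps back to the setjmp point until the
   counter reaches 3, then returns the counter. */
int retry_count(void)
{
    /* volatile: modified between setjmp and longjmp, read afterwards. */
    volatile int attempts = 0;

    /* Control resumes here after every longjmp below. */
    (void) setjmp(env);

    attempts++;
    if (attempts < 3)
        longjmp(env, 1);

    return attempts;
}
```

Calling retry_count() passes the setjmp point three times before returning.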
But it seems cilk depends on implementation details of setjmp/longjmp which looks bogus anyway.
(In reply to Richard Biener from comment #4)
> Ah, those are the builtins used for SJLJ and they get lowered to
> setjmp_setup,dispatcher and longjmp.
>
> Don't use __builtin_setjmp as I said, use setjmp.
>
> I suppose these shouldn't have been made callable by user code? Eric?

There are a few run-time testcases in gcc/testsuite, which indicates that they work in user code and may have some limitations.
> Ah, those are the builtins used for SJLJ and they get lowered to
> setjmp_setup,dispatcher and longjmp.

Right, they are the efficient version of setjmp/longjmp.

> I suppose these shouldn't have been made callable by user code? Eric?

Yes, but it's probably too late to change that.
(In reply to Eric Botcazou from comment #7)
> > Ah, those are the builtins used for SJLJ and they get lowered to
> > setjmp_setup,dispatcher and longjmp.
>
> Right, they are the efficient version of setjmp/longjmp.

What restrictions do they have? Can they be used within the same function? Can they be used within functions with parameters?
> What restrictions do they have? Can they be used within the
> same function? Can they be used within functions with parameters?

The only restriction I know of is that they cannot be in the same function, it's even enforced by the inliner:

    case BUILT_IN_LONGJMP:
      /* We can't inline functions that call __builtin_longjmp at
         all.  The non-local goto machinery really requires the
         destination be in a different function.  If we allow the
         function calling __builtin_longjmp to be inlined into the
         function calling __builtin_setjmp, Things will Go Awry.  */
      inline_forbidden_reason
        = G_("function %q+F can never be inlined because "
             "it uses setjmp-longjmp exception handling");
(In reply to Eric Botcazou from comment #9)
> > What restrictions do they have? Can they be used within the
> > same function? Can they be used within functions with parameters?
>
> The only restriction I know of is that they cannot be in the same function,
> it's even enforced by the inliner:
>
>     case BUILT_IN_LONGJMP:
>       /* We can't inline functions that call __builtin_longjmp at
>          all.  The non-local goto machinery really requires the
>          destination be in a different function.  If we allow the
>          function calling __builtin_longjmp to be inlined into the
>          function calling __builtin_setjmp, Things will Go Awry.  */
>       inline_forbidden_reason
>         = G_("function %q+F can never be inlined because "
>              "it uses setjmp-longjmp exception handling");

But gcc doesn't issue anything for this testcase:

[hjl@gnu-hsw-1 tmp]$ cat z.c
#include <setjmp.h>

extern jmp_buf buf;

int
foo (int i)
{
  int j = i + 1;
  if (__builtin_setjmp (buf))
    {
      j += 1;
      __builtin_longjmp (buf, 1);
    }
  return j + i;
}
[hjl@gnu-hsw-1 tmp]$ gcc -S -O2 z.c -Wall
[hjl@gnu-hsw-1 tmp]$

For this testcase:

[hjl@gnu-hsw-1 tmp]$ cat i.c
#include <setjmp.h>

extern jmp_buf buf;

int
foo (int i)
{
  int j = i + 1;
  if (!__builtin_setjmp (buf))
    {
      j += 1;
    }
  return j + i;
}
[hjl@gnu-hsw-1 tmp]$ gcc -O2 -S i.c
[hjl@gnu-hsw-1 tmp]$ cat i.s
        .file   "i.c"
        .text
        .p2align 4,,15
        .globl  foo
        .type   foo, @function
foo:
.LFB0:
        .cfi_startproc
.L4:
.L2:
        movq    %rsp, buf(%rip)
        movq    $.L2, buf+8(%rip)
        leal    2(%rdi,%rdi), %eax
        movq    %rsp, buf+16(%rip)
        ret
        .cfi_endproc
.LFE0:
        .size   foo, .-foo

%rdi holds the function parameter 'i'. But when __builtin_longjmp is called, %rdi can have some random value. GCC doesn't save %rdi first.
In the latter testcase foo doesn't call a function so there is never a need to save anything.
> %rdi holds the function parameter 'i'. But when __builtin_longjmp is
> called, %rdi can have some random value. GCC doesn't save %rdi first.

No, __builtin_longjmp doesn't touch %rdi at all. Don't worry too much, the SJLJ mechanism of the C++ and Ada compilers has been piggybacked on this for a couple of decades, this is quite robust.
(In reply to Eric Botcazou from comment #12)
> No, __builtin_longjmp doesn't touch %rdi at all. Don't worry too much, the
> SJLJ mechanism of the C++ and Ada compilers has been piggybacked on this for
> a couple of decades, this is quite robust.

This is good to hear. What is each field? I assume that the first 3 fields are frame address, resume address and stack address. Are they the same for all targets? What is the maximum number of fields?
> This is good to hear. What is each field? I assume that
> the first 3 fields are frame address, resume address and
> stack address. Are they the same for all targets? What is
> the maximum number of fields?

Everything is in the manual I think, otherwise certainly in the sources.
(In reply to Eric Botcazou from comment #14)
> > This is good to hear. What is each field? I assume that
> > the first 3 fields are frame address, resume address and
> > stack address. Are they the same for all targets? What is
> > the maximum number of fields?
>
> Everything is in the manual I think, otherwise certainly in the sources.

I couldn't find anything in the GCC manual.
(In reply to H.J. Lu from comment #15)
> (In reply to Eric Botcazou from comment #14)
> > > This is good to hear. What is each field? I assume that
> > > the first 3 fields are frame address, resume address and
> > > stack address. Are they the same for all targets? What is
> > > the maximum number of fields?
> >
> > Everything is in the manual I think, otherwise certainly in the sources.
>
> I couldn't find anything in GCC manual.

See tm.texi / md.texi.
(In reply to Richard Biener from comment #16)
> > I couldn't find anything in GCC manual.
>
> See tm.texi / md.texi.

This is the only thing I found:

 -- Macro: DONT_USE_BUILTIN_SETJMP
     Define this macro to 1 if the 'setjmp'/'longjmp'-based scheme
     should use the 'setjmp'/'longjmp' functions from the C library
     instead of the '__builtin_setjmp'/'__builtin_longjmp' machinery.

It doesn't say that they can't be used in the same function, what the size of the jump buffer is, or what each field in the jump buffer is used for.
> I couldn't find anything in GCC manual.

There are a few documented hooks, but this looks quite light indeed, so the sources are probably the best references, i.e. builtins.c and except.c:

/* __builtin_longjmp is passed a pointer to an array of five words (not
   all will be used on all machines).  It operates similarly to the C
   library function of the same name, but is more efficient.  Much of
   the code below is copied from the handling of non-local gotos.  */

static void
expand_builtin_longjmp (rtx buf_addr, rtx value)
{
  rtx fp, lab, stack, insn, last;
  enum machine_mode sa_mode = STACK_SAVEAREA_MODE (SAVE_NONLOCAL);

void
init_eh (void)
{
[...]
#ifdef DONT_USE_BUILTIN_SETJMP
#ifdef JMP_BUF_SIZE
      tmp = size_int (JMP_BUF_SIZE - 1);
#else
      /* Should be large enough for most systems, if it is not,
         JMP_BUF_SIZE should be defined with the proper value.  It will
         also tend to be larger than necessary for most systems, a more
         optimal port will define JMP_BUF_SIZE.  */
      tmp = size_int (FIRST_PSEUDO_REGISTER + 2 - 1);
#endif
#else
      /* builtin_setjmp takes a pointer to 5 words.  */
      tmp = size_int (5 * BITS_PER_WORD / POINTER_SIZE - 1);
#endif
      tmp = build_index_type (tmp);
      tmp = build_array_type (ptr_type_node, tmp);
      f_jbuf = build_decl (BUILTINS_LOCATION,
                           FIELD_DECL, get_identifier ("__jbuf"), tmp);
#ifdef DONT_USE_BUILTIN_SETJMP
      /* We don't know what the alignment requirements of the
         runtime's jmp_buf has.  Overestimate.  */
      DECL_ALIGN (f_jbuf) = BIGGEST_ALIGNMENT;
      DECL_USER_ALIGN (f_jbuf) = 1;
#endif
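Going by the excerpt above, the builtin buffer is an array of five pointer-sized words, which is a different type from the C library's jmp_buf. A minimal sketch of declaring it, assuming a target where pointers are word-sized; the typedef name builtin_jmp_buf and both helper functions are made up for illustration and are not GCC names.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical typedef: five pointer-sized slots, as in the quoted
   except.c comment ("builtin_setjmp takes a pointer to 5 words").
   Assumes pointers are word-sized on the target. */
typedef void *builtin_jmp_buf[5];

/* Size in bytes of the buffer passed to __builtin_setjmp. */
size_t builtin_buf_bytes(void)
{
    return sizeof(builtin_jmp_buf);
}

/* Size in bytes of the C library jmp_buf, for comparison.  Its size is
   platform-defined and unrelated to the five-word builtin buffer. */
size_t libc_buf_bytes(void)
{
    return sizeof(jmp_buf);
}
```

Which slots are actually filled in is machine-specific, so code should treat the contents as opaque.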
(In reply to Eric Botcazou from comment #18)
> > I couldn't find anything in GCC manual.
>
> There are a few documented hooks, but this looks quite light indeed, so the
> sources are probably the best references, i.e. builtins.c and except.c:

Would it be OK to submit a patch to document __builtin_longjmp/__builtin_setjmp based on their sources?
> Would it be OK to submit a patch to document
> __builtin_longjmp/__builtin_setjmp based on their
> sources?

I think that we would need to issue an error if both are in the same function.
It's only an error if they use the same jmpbuf.
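A minimal sketch of the arrangement this restriction calls for, assuming GCC on a target that supports the builtins: __builtin_setjmp stays in the caller, and __builtin_longjmp on the same buffer goes into a separate, never-inlined function. The names jump_buf, jump_back and run_once are made up for illustration.

```c
/* Five pointer-sized words, the size quoted from the GCC sources. */
static void *jump_buf[5];

/* Hypothetical helper: the jump lives in its own function so that it is
   never in the same function as the matching __builtin_setjmp. */
__attribute__((noinline, noreturn))
static void jump_back(void)
{
    /* The second argument of __builtin_longjmp must be 1. */
    __builtin_longjmp(jump_buf, 1);
}

/* Returns 1 when control came back through the jump, 0 otherwise. */
int run_once(void)
{
    if (__builtin_setjmp(jump_buf) == 0) {
        jump_back();    /* does not return */
        return 0;       /* not reached */
    }
    return 1;           /* reached via __builtin_longjmp */
}
```

The function holding __builtin_setjmp is still active when the jump happens, which is the other condition raised later in this thread.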
Author: bviyer
Date: Fri Nov 8 19:52:27 2013
New Revision: 204592

URL: http://gcc.gnu.org/viewcvs?rev=204592&root=gcc&view=rev
Log:
+2013-11-08  Balaji V. Iyer  <balaji.v.iyer@intel.com>
+
+       PR c/59039
+       * runtime/cilk_fiber-unix.cpp: Fixed a crash in run() function
+       when optimization is turned on.
+

Modified:
    trunk/libcilkrts/ChangeLog
    trunk/libcilkrts/runtime/cilk_fiber-unix.cpp
Created attachment 31186
A patch to document __builtin_setjmp/__builtin_longjmp

Does it look OK?
> Does it look OK?

Mostly, but I wouldn't go into full details about what the buffer contains, since this is machine-specific and not portable. Maybe something like:

"The @code{setjmp} buffer is an array of five @code{intptr_t}. The buffer will generally contain the frame address, the resume address and the stack address. The other elements may be used in a machine-specific way."
(In reply to H.J. Lu from comment #23)
> Does it look OK?

I would not say that __builtin_setjmp is more efficient. It really saves just as many registers, except that it has help from the containing function's prologue/epilogue to do so, rather than saving them all within the jmpbuf.

This means that the set of registers saved is determined by the ISA under which the setjmp was compiled, which makes the builtin less flexible than the libc variant when it comes to ISA extensions such as we see on ARM and PPC.

I'm not keen on encouraging any user to use these functions. It's simply not worth it to us as maintainers.

The fact that we've got code in libgcc that uses them means that we must continue to have these functions callable by some means. If folks would be happier if we hid these from users by making them only callable with some special option like -fbuild-libgcc, I could live with that.
> I would not say that __builtin_setjmp is more efficient. It really saves
> just as many registers, except that it has help from the containing
> function's prologue/epilogue to do so, rather than saving them all within the
> jmpbuf.

[As well as from the register allocator]. I think it's more efficient though, for example if you implement an SJLJ exception scheme on top of it.

> I'm not keen on encouraging any user to use these functions.
> It's simply not worth it to us as maintainers.
>
> The fact that we've got code in libgcc that uses them means that we must
> continue to have these functions callable by some means. If folks would be
> happier if we hid these from users by making them only callable with some
> special option like -fbuild-libgcc, I could live with that.

IMO it's too late to hide them after 2 decades. If people are really afraid of them, then we could keep the status quo.
Balaji fixed the ICE a while back. Based on c#26, I don't think we should hide the functions from being used/called from user code. So the only issue left is the doc fix, right?
> Balaji fixed the ICE a while back. Based on c#26, I don't think we should
> hide the functions from being used/called from user code. So the only issue
> left is the doc fix, right?

That's my understanding, yes.
*** Bug 69887 has been marked as a duplicate of this bug. ***
I've been looking at this issue. The proposed patch has a number of problems, notably that it uses "@code{setjmp} buffer" to describe the argument that is *not* the same as the buffer you would pass to the setjmp function. (It's a different flavor of buffer.)

I'm also thinking that "@code{__builtin_setjmp} and @code{__builtin_longjmp} may not be used in the same function with the same @code{setjmp} buffer" doesn't accurately capture the restriction. If builtin_setjmp uses its containing function's prologue/epilogue to do the register saves and restores (comment 25), it seems like that also implies that builtin_longjmp can only be called from functions called directly or indirectly from that containing function. And indeed, at least on nios2-elf this program wanders off into the weeds:

#include <stdint.h>

int
my_setjmp (intptr_t *buf)
{
  return __builtin_setjmp (buf);
}

void
my_longjmp (intptr_t *buf)
{
  __builtin_longjmp (buf, 1);
}

int
main (void)
{
  intptr_t buf[5];
  int ret;

  ret = my_setjmp (buf);
  if (ret == 0)
    my_longjmp (buf);
  return ret;
}

whereas calling __builtin_setjmp directly from main works and returns 1 as expected.
What you describe, Sandra, is mentioned in the man page for longjmp(3). Maybe we can steal some of its documentation.

Caveats
    If the function which called setjmp() returns before longjmp() is called, the behavior is undefined. Some kind of subtle or unsubtle chaos is sure to result.
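A minimal sketch of a layout that respects this caveat, assuming GCC on a target that supports the builtins: __builtin_setjmp is called directly in the function whose frame stays live, and __builtin_longjmp is reached only from functions called (directly or indirectly) from it. All names here (unwind_buf, deepest_level, bail_out, descend, unwind_from_depth) are made up for illustration.

```c
/* Five pointer-sized words for the builtin jump buffer. */
static void *unwind_buf[5];

/* volatile so that the value read after the jump comes from memory. */
static volatile int deepest_level;

/* Hypothetical helper: performs the jump from its own function. */
__attribute__((noinline, noreturn))
static void bail_out(void)
{
    __builtin_longjmp(unwind_buf, 1);
}

/* Recurses to level 3, recording each level, then jumps out. */
__attribute__((noinline))
static void descend(int level)
{
    deepest_level = level;
    if (level < 3)
        descend(level + 1);
    else
        bail_out();
}

/* The frame of this function is still active when bail_out() jumps,
   unlike the my_setjmp() wrapper in the failing program above. */
int unwind_from_depth(void)
{
    deepest_level = 0;
    if (__builtin_setjmp(unwind_buf) == 0) {
        descend(1);
        return -1;      /* not reached: descend() always jumps out */
    }
    return deepest_level;
}
```

Wrapping __builtin_setjmp in a helper that returns before the jump, as my_setjmp does above, is the pattern to avoid.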
New patch posted for review here: https://gcc.gnu.org/ml/gcc-patches/2018-12/msg00004.html
Author: sandra
Date: Tue Dec 4 04:22:37 2018
New Revision: 266770

URL: https://gcc.gnu.org/viewcvs?rev=266770&root=gcc&view=rev
Log:
2018-12-03  Sandra Loosemore  <sandra@codesourcery.com>

        PR c/59039

        gcc/
        * doc/extend.texi (Nonlocal gotos): New section.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/doc/extend.texi
Fixed on trunk.