Bug 100402 - [10.3 regression] crash with setjmp/longjmp and SEH
Summary: [10.3 regression] crash with setjmp/longjmp and SEH
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 10.3.0
: P3 normal
Target Milestone: 10.4
Assignee: Eric Botcazou
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-05-03 17:37 UTC by Hannes Domani
Modified: 2021-05-05 21:03 UTC (History)
2 users (show)

See Also:
Host:
Target: x86_64-w64-mingw32
Build:
Known to work:
Known to fail:
Last reconfirmed: 2021-05-04 00:00:00


Attachments
preprocessed code (-E) (1.41 KB, text/plain)
2021-05-03 17:38 UTC, Hannes Domani
Details
assembly (-S) (381 bytes, text/plain)
2021-05-03 17:38 UTC, Hannes Domani
Details
output of -fdump-tree-optimized (539 bytes, text/plain)
2021-05-03 17:39 UTC, Hannes Domani
Details
Tentative fix (358 bytes, patch)
2021-05-05 01:02 UTC, Eric Botcazou
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Hannes Domani 2021-05-03 17:37:03 UTC
The following example:
---------------------------------- >8 --------------------------------------
// gcc -O1 test.c

#include <setjmp.h>

static jmp_buf buf;
static bool stop = false;

void call_func(void(*func)(void))
{
  func();
}

void func()
{
  stop = true;
  longjmp(buf, 1);
}

int main()
{
  setjmp(buf);
  while (!stop)
    call_func(func);

  return 0;
}
---------------------------------- >8 --------------------------------------

Crashes deep in longjmp():

Starting program: C:\src\tests\a.exe
gdb: unknown target exception 0xc0000028 at 0x779dd7e8

Program received signal ?, Unknown signal.
0x00000000779dd7e8 in ntdll!RtlRaiseStatus () from C:\Windows\SYSTEM32\ntdll.dll
(gdb) bt
#0  0x00000000779dd7e8 in ntdll!RtlRaiseStatus () from C:\Windows\SYSTEM32\ntdll.dll
#1  0x000000007797f4ec in ntdll!RtlIsDosDeviceName_U () from C:\Windows\SYSTEM32\ntdll.dll
#2  0x000007fefe11e5a3 in msvcrt!longjmp () from C:\Windows\system32\msvcrt.dll
#3  0x000000013fe61620 in func ()
#4  0x000000013fe61627 in call_func ()
#5  0x000000013fe61664 in main ()
#6  0x000000013fe61431 in __tmainCRTStartup () at C:/gcc/src/mingw-w64-v8.0.0/mingw-w64-crt/crt/crtexe.c:345
#7  0x000000013fe615b6 in mainCRTStartup () at C:/gcc/src/mingw-w64-v8.0.0/mingw-w64-crt/crt/crtexe.c:220

Note: Doesn't crash with -O0, or if I change `while (!stop)` to `if (!stop)`.
And it also works fine with gcc-10.2, or if I revert both fixes [1] [2] of PR99234.

[1] https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=49a714e59194a7c549aa6657676a1b4be4520650
[2] https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=2939b358936bb824330888def98ad848dea41483
Comment 1 Hannes Domani 2021-05-03 17:38:20 UTC
Created attachment 50743 [details]
preprocessed code (-E)
Comment 2 Hannes Domani 2021-05-03 17:38:57 UTC
Created attachment 50744 [details]
assembly (-S)
Comment 3 Hannes Domani 2021-05-03 17:39:50 UTC
Created attachment 50745 [details]
output of -fdump-tree-optimized
Comment 4 Richard Biener 2021-05-04 07:01:53 UTC
Works on x86_64-linux.
Comment 5 Eric Botcazou 2021-05-04 22:36:25 UTC
Weird.  Can you post the diff of the assembly between 10.2 and 10.3?
Comment 6 Eric Botcazou 2021-05-04 22:48:11 UTC
Your testcase does not compile with the C compiler:

pr100402.c:4:8: error: unknown type name 'bool'
    4 | static bool stop = false;
      |        ^~~~
pr100402.c:4:20: error: 'false' undeclared here (not in a function)
    4 | static bool stop = false;
      |                    ^~~~~
pr100402.c: In function 'func':
pr100402.c:13:10: error: 'true' undeclared (first use in this function)
   13 |   stop = true;
      |          ^~~~
pr100402.c:13:10: note: each undeclared identifier is reported only once for each function it appears in

Compiling it with the C++ compiler works for me:

$ g++ -o pr100402 pr100402.C -O
$ ./pr100402.exe 
$
Comment 7 Eric Botcazou 2021-05-04 23:14:27 UTC
Indeed something does not work with -O:

@ ./pr100402.exe 
$ echo $?
127
Comment 8 Eric Botcazou 2021-05-05 00:43:07 UTC
Ugh, there is some black magic involved with setjmp and SEH:

#    ifdef __SEH__
#     if (__MINGW_GCC_VERSION < 40702)
#      define setjmp(BUF) _setjmpex((BUF), mingw_getsp())
#      define setjmpex(BUF) _setjmpex((BUF), mingw_getsp())
#     else
#      define setjmp(BUF) _setjmpex((BUF), __builtin_frame_address (0))
#      define setjmpex(BUF) _setjmpex((BUF), __builtin_frame_address (0))
#     endif
#    else
#      define setjmp(BUF) _setjmpex((BUF), NULL)
#      define setjmpex(BUF) _setjmpex((BUF), NULL)
#    endif

so the original code would leave the frame pointer misaligned whereas we now realign it.
Comment 9 Eric Botcazou 2021-05-05 01:00:12 UTC
Investigating.
Comment 10 Eric Botcazou 2021-05-05 01:02:57 UTC
Created attachment 50754 [details]
Tentative fix

Please give it a try if you can rebuild the compiler.
Comment 11 Hannes Domani 2021-05-05 15:44:31 UTC
> Your testcase does not compile with the C compiler:
> 
> Compiling it with the C++ compiler works for me:

Sorry about the c/c++ confusion.


> Created attachment 50754 [details]
> Tentative fix
> 
> Please give it a try if you can rebuild the compiler.

Yes, I can confirm that this fixes the crash with my attached test, and also the original non-reduced application (which is gdb).
Comment 12 CVS Commits 2021-05-05 20:55:35 UTC
The master branch has been updated by Eric Botcazou <ebotcazou@gcc.gnu.org>:

https://gcc.gnu.org/g:e8d1ca7d2c344a411779892616c423e157f4aea8

commit r12-525-ge8d1ca7d2c344a411779892616c423e157f4aea8
Author: Eric Botcazou <ebotcazou@adacore.com>
Date:   Wed May 5 22:48:51 2021 +0200

    Fix PR target/100402
    
    This is a regression for 64-bit Windows present from mainline down to the 9
    branch and introduced by the fix for PR target/99234.  Again SEH, but with
    a twist related to the way MinGW implements setjmp/longjmp, which turns out
    to be piggybacked on SEH with recent versions of MinGW, i.e. the longjmp
    performs a bona-fide unwinding of the stack, because it calls RtlUnwindEx
    with the second argument initially passed to setjmp, which is the result of
    __builtin_frame_address (0) in the MinGW header file:
    
      define setjmp(BUF) _setjmp((BUF), __builtin_frame_address (0))
    
    This means that we directly expose the frame pointer to the SEH machinery
    here (unlike with regular exception handling where we use an intermediate
    CFA) and thus that we cannot do whatever we want with it.  The old code
    would leave it unaligned, i.e. not multiple of 16, whereas the new code
    aligns it, but this breaks for some reason; at least it appears that a
    .seh_setframe directive with 0 as second argument always works, so the
    fix aligns it this way.
    
    gcc/
            PR target/100402
            * config/i386/i386.c (ix86_compute_frame_layout): For a SEH target,
            always return the establisher frame for __builtin_frame_address (0).
    gcc/testsuite/
            * gcc.c-torture/execute/20210505-1.c: New test.
Comment 13 CVS Commits 2021-05-05 20:56:54 UTC
The releases/gcc-11 branch has been updated by Eric Botcazou <ebotcazou@gcc.gnu.org>:

https://gcc.gnu.org/g:e9a8d6852c966e0511f2cfe40224fd81cbeaae24

commit r11-8358-ge9a8d6852c966e0511f2cfe40224fd81cbeaae24
Author: Eric Botcazou <ebotcazou@adacore.com>
Date:   Wed May 5 22:48:51 2021 +0200

    Fix PR target/100402
    
    This is a regression for 64-bit Windows present from mainline down to the 9
    branch and introduced by the fix for PR target/99234.  Again SEH, but with
    a twist related to the way MinGW implements setjmp/longjmp, which turns out
    to be piggybacked on SEH with recent versions of MinGW, i.e. the longjmp
    performs a bona-fide unwinding of the stack, because it calls RtlUnwindEx
    with the second argument initially passed to setjmp, which is the result of
    __builtin_frame_address (0) in the MinGW header file:
    
      define setjmp(BUF) _setjmp((BUF), __builtin_frame_address (0))
    
    This means that we directly expose the frame pointer to the SEH machinery
    here (unlike with regular exception handling where we use an intermediate
    CFA) and thus that we cannot do whatever we want with it.  The old code
    would leave it unaligned, i.e. not multiple of 16, whereas the new code
    aligns it, but this breaks for some reason; at least it appears that a
    .seh_setframe directive with 0 as second argument always works, so the
    fix aligns it this way.
    
    gcc/
            PR target/100402
            * config/i386/i386.c (ix86_compute_frame_layout): For a SEH target,
            always return the establisher frame for __builtin_frame_address (0).
    gcc/testsuite/
            * gcc.c-torture/execute/20210505-1.c: New test.
Comment 14 CVS Commits 2021-05-05 20:58:02 UTC
The releases/gcc-10 branch has been updated by Eric Botcazou <ebotcazou@gcc.gnu.org>:

https://gcc.gnu.org/g:e3abcc56d2604b9d2652b615ff9e68981cb7f79e

commit r10-9804-ge3abcc56d2604b9d2652b615ff9e68981cb7f79e
Author: Eric Botcazou <ebotcazou@adacore.com>
Date:   Wed May 5 22:48:51 2021 +0200

    Fix PR target/100402
    
    This is a regression for 64-bit Windows present from mainline down to the 9
    branch and introduced by the fix for PR target/99234.  Again SEH, but with
    a twist related to the way MinGW implements setjmp/longjmp, which turns out
    to be piggybacked on SEH with recent versions of MinGW, i.e. the longjmp
    performs a bona-fide unwinding of the stack, because it calls RtlUnwindEx
    with the second argument initially passed to setjmp, which is the result of
    __builtin_frame_address (0) in the MinGW header file:
    
      define setjmp(BUF) _setjmp((BUF), __builtin_frame_address (0))
    
    This means that we directly expose the frame pointer to the SEH machinery
    here (unlike with regular exception handling where we use an intermediate
    CFA) and thus that we cannot do whatever we want with it.  The old code
    would leave it unaligned, i.e. not multiple of 16, whereas the new code
    aligns it, but this breaks for some reason; at least it appears that a
    .seh_setframe directive with 0 as second argument always works, so the
    fix aligns it this way.
    
    gcc/
            PR target/100402
            * config/i386/i386.c (ix86_compute_frame_layout): For a SEH target,
            always return the establisher frame for __builtin_frame_address (0).
    gcc/testsuite/
            * gcc.c-torture/execute/20210505-1.c: New test.
Comment 15 CVS Commits 2021-05-05 21:01:19 UTC
The releases/gcc-9 branch has been updated by Eric Botcazou <ebotcazou@gcc.gnu.org>:

https://gcc.gnu.org/g:182dba3fd507487a724090f1c95eea99a1b9ccad

commit r9-9515-g182dba3fd507487a724090f1c95eea99a1b9ccad
Author: Eric Botcazou <ebotcazou@adacore.com>
Date:   Wed May 5 22:48:51 2021 +0200

    Fix PR target/100402
    
    This is a regression for 64-bit Windows present from mainline down to the 9
    branch and introduced by the fix for PR target/99234.  Again SEH, but with
    a twist related to the way MinGW implements setjmp/longjmp, which turns out
    to be piggybacked on SEH with recent versions of MinGW, i.e. the longjmp
    performs a bona-fide unwinding of the stack, because it calls RtlUnwindEx
    with the second argument initially passed to setjmp, which is the result of
    __builtin_frame_address (0) in the MinGW header file:
    
      define setjmp(BUF) _setjmp((BUF), __builtin_frame_address (0))
    
    This means that we directly expose the frame pointer to the SEH machinery
    here (unlike with regular exception handling where we use an intermediate
    CFA) and thus that we cannot do whatever we want with it.  The old code
    would leave it unaligned, i.e. not multiple of 16, whereas the new code
    aligns it, but this breaks for some reason; at least it appears that a
    .seh_setframe directive with 0 as second argument always works, so the
    fix aligns it this way.
    
    gcc/
            PR target/100402
            * config/i386/i386.c (ix86_compute_frame_layout): For a SEH target,
            always return the establisher frame for __builtin_frame_address (0).
    gcc/testsuite/
            * gcc.c-torture/execute/20210505-1.c: New test.
Comment 16 Eric Botcazou 2021-05-05 21:03:48 UTC
> Yes, I can confirm that this fixes the crash with my attached test, and also
> the original non-reduced application (which is gdb).

OK, thanks for testing and for reporting the problem in the first place.