Created attachment 38911: test case to reproduce crash problem

This problem happens when we implement a user-space context-switching framework with setjmp and longjmp. The attached file is a simple test case that reproduces it.

We create a thread with pthread_create and mmap two memory blocks as its stack pool. We then use setjmp and longjmp to make the thread switch between these two stacks. We call the stack that pthread_create allocates for the thread the original stack, and the two mmap'd stacks stack 1 and stack 2. The thread switches from the original stack to stack 1 only once, right after it is created, and from then on switches only between stack 1 and stack 2.

The result is that if we release stack 1 while the thread is running on stack 2 and then cancel the thread, libgcc crashes the process while unwinding in the cancel handler. It tries to access an address on stack 1, which has already been released. However, if we release stack 2 instead and cancel the thread, libgcc runs fine.

We first found this problem on Wind River's commercial version and then reproduced it on other free releases. We have tested on X86_64, MIPS and PPC and found that it only happens on X86_64.

Compile the test case simply with "gcc -lpthread my_test.c -o my_test".

If -fno-asynchronous-unwind-tables is used so that the unwind tables are not generated, the process does not crash.

Version information:

1.
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-linux-gnu/4.9.3/lto-wrapper
Target: x86_64-linux-gnu
Configured with: /build/distro/work/shared/gcc-4.9.3/configure --build=none --host=x86_64-linux-gnu --target=x86_64-linux-gnu --prefix=/usr --with-sysroot=/ --with-build-sysroot=/build/distro/work/x86_64/rootfs/x86_64-linux-gnu --disable-nls --disable-bootstrap --enable-languages=c,c++ --with-system-zlib --enable-shared --disable-static --with-pkgversion=distro-v2.5-sctpmh --disable-install-libiberty --with-arch=core2 --disable-multilib
Thread model: posix
gcc version 4.9.3 (distro-v2.5-sctpmh)

2.
$ gcc -v
Reading specs from /usr/lib/gcc/i386-redhat-linux/3.4.5/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-java-awt=gtk --host=i386-redhat-linux
Thread model: posix
gcc version 3.4.5 20051201 (Red Hat 3.4.5-2)

3.
Thread model: posix
gcc version 4.4.1 (Wind River Linux Sourcery G++ 4.4a-450)
I don't think this is a valid thing to do with setjmp and longjmp.

Why are you not using makecontext/setcontext/getcontext/swapcontext instead?

Also, why do you think this is a libgcc bug? If you try to unwind the stack using gdb, you will get the same behavior because the stack was that one thread is now on the other one but that thread has now exited.
(In reply to Andrew Pinski from comment #1)
> I don't think this is a valid thing to do with setjmp and longjmp.
>
> Why are you not using makecontext/setcontext/getcontext/swapcontext instead?
>
> Also why do you think this is a libgcc bug because if you try to unwind the
> stack using gdb, you will get the same behavior because the stack was that
> one thread is now on the other one but the that thread has now exited.

Thanks. I switched to makecontext/swapcontext and it works well.

But, back to this problem, I still think something is wrong in the toolchain, either gcc or libgcc. gcc generates the asynchronous unwind tables and libgcc uses them to unwind. As I understand it, when unwinding a thread, the unwinder should not access a context that no longer belongs to that thread.

And why does this only happen on X86_64? Is it related to the special definition of unwind tables in the X86_64 ABI, which differs somewhat from formal DWARF?
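
For readers following the thread, the makecontext/swapcontext approach mentioned above can be sketched roughly as follows. This is not the attached test case and does not involve pthread_cancel; it only shows one thread of control bouncing between two mmap'd stacks. The names run_switch_demo, on_stack1, on_stack2, trace, and the STACK_SIZE and ROUNDS values are all assumptions made for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)  /* assumed size, not taken from the report */
#define ROUNDS 3                /* assumed number of round trips */

static ucontext_t main_ctx, ctx1, ctx2;
int trace[2 * ROUNDS];          /* records which stack ran, in order */
int trace_len;

/* Runs on stack 1: record a visit, then hand control to stack 2. */
static void on_stack1(void)
{
    for (int i = 0; i < ROUNDS; i++) {
        assert(trace_len < 2 * ROUNDS);
        trace[trace_len++] = 1;
        swapcontext(&ctx1, &ctx2);
    }
    /* Returning here resumes uc_link, i.e. main_ctx. */
}

/* Runs on stack 2: record a visit, then hand control back to stack 1. */
static void on_stack2(void)
{
    for (int i = 0; i < ROUNDS; i++) {
        assert(trace_len < 2 * ROUNDS);
        trace[trace_len++] = 2;
        swapcontext(&ctx2, &ctx1);
    }
}

static void *alloc_stack(void)
{
    void *p = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

static int init_ctx(ucontext_t *ctx, void *stack, void (*fn)(void))
{
    if (getcontext(ctx) != 0)
        return -1;
    ctx->uc_stack.ss_sp = stack;
    ctx->uc_stack.ss_size = STACK_SIZE;
    ctx->uc_link = &main_ctx;   /* where to go when fn returns */
    makecontext(ctx, fn, 0);
    return 0;
}

/* Returns the number of recorded visits, or -1 on setup failure. */
int run_switch_demo(void)
{
    void *s1 = alloc_stack();
    void *s2 = alloc_stack();
    if (s1 == NULL || s2 == NULL)
        return -1;
    trace_len = 0;
    if (init_ctx(&ctx1, s1, on_stack1) != 0 ||
        init_ctx(&ctx2, s2, on_stack2) != 0)
        return -1;
    if (swapcontext(&main_ctx, &ctx1) != 0)
        return -1;
    /* Back on the caller's own stack, so both pools can be released. */
    munmap(s1, STACK_SIZE);
    munmap(s2, STACK_SIZE);
    return trace_len;
}
```

Calling run_switch_demo() from main (or from a thread start routine) should alternate between the two stacks and then return to the caller's stack before the mmap'd blocks are released. In this sketch nothing is left running on either stack when it is unmapped, which is the situation the original report does not guarantee.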