Bug 55917 - Impossible to find/debug unhandled exceptions in an std::thread
Summary: Impossible to find/debug unhandled exceptions in an std::thread
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 4.7.2
: P3 enhancement
Target Milestone: 8.0
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-01-09 13:20 UTC by Tobias Ringström
Modified: 2017-06-12 16:39 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments
Test program to illustrate the problem (114 bytes, text/plain)
2013-01-09 13:22 UTC, Tobias Ringström
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Tobias Ringström 2013-01-09 13:20:28 UTC
If an unhandled exception occurs in an std::thread, the stack is unwound before std::terminate is called, which makes it impossible to find the location of the exception.

I emailed gcc-help about this about a year ago, and learned that a fix was supposed to be applied for 4.7, but that fix didn't work out. Here's my post:

  http://gcc.gnu.org/ml/gcc-help/2011-11/msg00140.html

The reason for the unwound stack is that libstdc++'s function that calls the user's thread function contains a try/catch all statement.  The supposed fix was to use noexcept on the internal thread main function, and the reason why it did not work out is described here:

  http://gcc.gnu.org/ml/gcc-help/2013-01/msg00068.html

I've tested 4.6, 4.7 and a 4.8 snapshot with identical results.  See the attached test program tmp4.cpp. Compile with:

  g++ -std=c++0x -g tmp4.cpp -lpthread

Run in gdb:

~> gdb a.out
[...]
(gdb) r
Starting program: /home/tobias/a.out 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff7fd0700 (LWP 10278)]
terminate called after throwing an instance of 'std::runtime_error'
  what():  foo

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff7fd0700 (LWP 10278)]
0x000000318f036285 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.14.90-24.fc16.9.x86_64 libgcc-4.6.3-2.fc16.x86_64 libstdc++-4.6.3-2.fc16.x86_64
(gdb) bt
#0  0x000000318f036285 in raise () from /lib64/libc.so.6
#1  0x000000318f037b9b in abort () from /lib64/libc.so.6
#2  0x00000031964bbc5d in __gnu_cxx::__verbose_terminate_handler() ()
   from /usr/lib64/libstdc++.so.6
#3  0x00000031964b9e16 in ?? () from /usr/lib64/libstdc++.so.6
#4  0x00000031964b9e43 in std::terminate() () from /usr/lib64/libstdc++.so.6
#5  0x0000003196470b0b in ?? () from /usr/lib64/libstdc++.so.6
#6  0x000000318fc07d90 in start_thread () from /lib64/libpthread.so.0
#7  0x000000318f0f119d in clone () from /lib64/libc.so.6
(gdb) 

As you can see, neither the thread's main function, nor foo() is in the backtrace.
Comment 1 Tobias Ringström 2013-01-09 13:22:52 UTC
Created attachment 29120 [details]
Test program to illustrate the problem
Comment 2 Jonathan Wakely 2013-01-09 13:44:35 UTC
As I said at http://gcc.gnu.org/ml/gcc-help/2013-01/msg00068.html this can't be changed for std::thread, see http://gcc.gnu.org/ml/libstdc++/2012-12/msg00062.html and the patch at http://gcc.gnu.org/ml/libstdc++/2012-12/msg00068.html for discussion and code that absolutely requires exceptions of type __forced_unwind to be able to escape the thread start function, otherwise thread cancellation crashes the program, which is not acceptable.

If the function is not noexcept then it must catch exceptions to ensure std::terminate is called as required, and catching the exception causes the stack to be unwound. So it's difficult do anything about std::thread.

Please open a separate bug (with Component=c++) for the http://gcc.gnu.org/ml/gcc-help/2013-01/msg00058.html issue when noexcept interacts with a destructor on the stack.  That applies to code not using std::thread too.
Comment 3 Jonathan Wakely 2013-01-09 13:48:04 UTC
Ah, I se PR 55918 - thanks!
Comment 4 Jonathan Wakely 2013-01-09 13:59:58 UTC
One option would be to make the start function have a dynamic-exception-specification of throw(__cxxabiv1::__forced_unwind), which would allow the cancellation exception to propagate (as required) but prevent any other exceptions, but that would result in a call to std::unexpected() rather than std::terminate(), and a user could have replaced the unexpected_handler, so we would not be able to meet the requirement that std::terminate() is called. On top of that, I think the stack would still get unwound before the call to std:unexpected, and dynamic-exception-specifications and unexpected handlers are deprecated in C++11.  So I don't think we want to do that.

My inclination is to close this as WONTFIX.
Comment 5 Tobias Ringström 2013-01-09 14:06:08 UTC
Yes, I thought two reports were in order, as they are only vaguely related.  To me, this one is the most important problem.  I struggle to understand how I can be the first to have this problem.  Surely it must be an enormous problem if you use std::thread?  I'm working on a somewhat large multi-threaded program, and if there's an exception anywhere, e.g. a failed range-check in a container, it's *completely impossible* to find the problem in a debugger.  We've now switched to boost::thread instead because it does not have this problem.

I must say that I'm very surprised that you call it an enhancement, and that you consider closing it as WONTFIX, seeing how the current behavior is so mindbogglingly unfriendly.  There is a reason why GCC does not unwind the stack for non-threaded unhandled exceptions.

Perhaps std::thread is not widely used yet, or I'm the only one with buggy code?
Comment 6 Jonathan Wakely 2013-01-09 14:42:00 UTC
(In reply to comment #5)
> Yes, I thought two reports were in order, as they are only vaguely related.  To
> me, this one is the most important problem.  I struggle to understand how I can
> be the first to have this problem.  Surely it must be an enormous problem if
> you use std::thread?  I'm working on a somewhat large multi-threaded program,
> and if there's an exception anywhere, e.g. a failed range-check in a container,
> it's *completely impossible* to find the problem in a debugger.

If you're running in the debugger, rather than examining a core file post-mortem, you can use "catch throw" or "break __cxa_throw" to break when the exception is thrown.

Otherwise you already know the workaround, put 'noexcept' on the function you pass to std::thread (which breaks pthread_cancel but if you're not using that it doesn't matter.)

>  We've now
> switched to boost::thread instead because it does not have this problem.
> 
> I must say that I'm very surprised that you call it an enhancement,

The current implementation conforms to the standard.

> and that
> you consider closing it as WONTFIX,

I've explained why it can't easily be changed, unless anyone has some idea I haven't thought of yet to preserve pthread_cancel behaviour and preserve the requirement that std::terminate() is called.

> seeing how the current behavior is so
> mindbogglingly unfriendly.  There is a reason why GCC does not unwind the stack
> for non-threaded unhandled exceptions.

The behaviour comes directly from the explicit requirement in the standard that an unhandled exception in a std::thread must call std::terminate. 
 
If it's guaranteed that an uncaught exception leaving that function will still call terminate, then the catch(...) block could be removed. Maybe I could do that conditionally for targets where it's known to work as required.
Comment 7 Yuri Aksenov 2013-07-02 10:30:47 UTC
> If it's guaranteed that an uncaught exception leaving that function will still 
> call terminate, then the catch(...) block could be removed. Maybe I could do 
> that conditionally for targets where it's known to work as required.

I vote for this enhancement to remove catch(...) block from execute_native_thread_routine function.

According to standard (15.3) Handling an exception
> If no matching handler is found, the function std::terminate() is called; 
> whether or not the stack is unwound before this call to std::terminate() is 
> implementation-defined (15.5.1).

So if we remove catch(...), std::terminate should be called anyway (it's a bug otherwise). But GCC implementaion will leave stack unwound making it easier to debug.
Comment 8 Jonathan Wakely 2013-07-02 11:19:51 UTC
(In reply to Yuri Aksenov from comment #7)
> > If it's guaranteed that an uncaught exception leaving that function will still 
> > call terminate, then the catch(...) block could be removed. Maybe I could do 
> > that conditionally for targets where it's known to work as required.
> 
> I vote for this enhancement to remove catch(...) block from
> execute_native_thread_routine function.

This isn't a democracy ;)

> According to standard (15.3) Handling an exception
> > If no matching handler is found, the function std::terminate() is called; 
> > whether or not the stack is unwound before this call to std::terminate() is 
> > implementation-defined (15.5.1).
> 
> So if we remove catch(...), std::terminate should be called anyway (it's a
> bug otherwise).

The catch(...) is there to provide the [thread.thread.constr]/4 requirement that std::terminate is called. If it isn't there, are you saying Pthreads or the OS guarantees that an exception in the function passed to pthread_create() will cause a call to std::terminate()? Where is that guaranteed?  Where is it implemented?

You can't just say "rely on the implementation, it's a bug otherwise", this code **is** the implementation.
Comment 9 Jonathan Wakely 2013-07-02 11:26:53 UTC
P.S. I do want to improve the behaviour, and I think we can make a change for targets where we know the behaviour is OK, but we need to check carefully that it works correctly, not just remove the catch(...)
Comment 10 Yuri Aksenov 2013-07-02 12:05:32 UTC
OK, [thread.thread.constr]/4 says:
> If the invocation of INVOKE(decay_copy( std::forward<F>(f)), 
> decay_copy(std::forward<Args>(args))...) terminates with an uncaught exception, 
> std::terminate shall be called.

And [except.handle]/9 says:
> If no matching handler is found, the function std::terminate() is called

So we don't need to set default handler unless there is bug in implementation of [except.handle]/9

I have tested std::thread with removed catch(...) for both -m32 and -m64 environments on
gcc version 4.7.1 20120723 [gcc-4_7-branch revision 189773] (SUSE Linux) 
Target: x86_64-suse-linux

And I can test it on
Target: armv5b-softfloat-linux

But I can not be shure for any possible target
Comment 11 Yuri Aksenov 2013-07-02 12:27:30 UTC
> Where is that guaranteed?  Where is it implemented?

And here is stack trace of patched version, it seems to be implemented in __cxxabiv1::__cxa_throw

(gdb) bt
#0  0x00007ffff703ad25 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007ffff703c1a8 in __GI_abort () at abort.c:91
#2  0x00007ffff791d68d in __gnu_cxx::__verbose_terminate_handler () at ../../../../libstdc++-v3/libsupc++/vterminate.cc:95
#3  0x00007ffff791b796 in __cxxabiv1::__terminate (handler=<optimized out>) at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:40
#4  0x00007ffff791b7c3 in std::terminate () at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:50
#5  0x00007ffff791b9ee in __cxxabiv1::__cxa_throw (obj=0x7ffff0000940, tinfo=<optimized out>, dest=<optimized out>) at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:83
#6  0x00000000004010f2 in f () at main.cpp:5
#7  0x000000000040242d in std::_Bind_simple<void (*())()>::_M_invoke<>(std::_Index_tuple<>) (this=0x606040) at /usr/include/c++/4.7/functional:1598
#8  0x000000000040237d in std::_Bind_simple<void (*())()>::operator()() (this=0x606040) at /usr/include/c++/4.7/functional:1586
#9  0x0000000000402316 in std::thread::_Impl<std::_Bind_simple<void (*())()> >::_M_run() (this=0x606028) at /usr/include/c++/4.7/thread:115
#10 0x000000000040266d in std::(anonymous namespace)::execute_native_thread_routine (__p=0x606028) at thread.cpp:73
#11 0x00007ffff7bc6e0e in start_thread (arg=0x7ffff7005700) at pthread_create.c:305
#12 0x00007ffff70ea2cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
Comment 12 Jonathan Wakely 2013-07-02 15:46:06 UTC
Yes it will probably work on all platforms using the Itanium exception-handling ABI, unless the OS's pthread_create (or equiv) has explicit exception-handling.
Comment 13 Roger Orr 2014-11-24 14:53:25 UTC
I am also affected by this problem.
What is the status of the idea Jonathan made (2013-07-02 11:26:53 UTC):

  ... and I think we can make a change for targets where we know the behaviour is OK ...
Comment 14 Howard Hinnant 2016-05-20 21:16:10 UTC
The main function is not wrapped in a try / catch and an unhandled exception on the main thread still calls std::terminate() as required.

I believe that gcc already correctly implements the call to std::terminate() for threads elsewhere, and with the same code that it does for main.  It is in the function __cxa_throw, implemented by libsupc++.  Here is the specification:

http://mentorembedded.github.io/cxx-abi/abi-eh.html#cxx-throw

which says in part:

> __Unwind_RaiseException begins the process of stack unwinding, described in Section 2.5. In special cases, such as an inability to find a handler, _Unwind_RaiseException may return. In that case, __cxa_throw will call terminate, assuming that there was no handler for the exception.

And here is where it is implemented:

https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/libsupc%2B%2B/eh_throw.cc#L87

As long as libsupc++ is being used on the platform, I think you are good to remove the try/catch from execute_native_thread_routine.

Fwiw, this is what is done on libc++ / libc++abi (no try/catch at the top of std::thread, and let __cxa_throw call terminate).  We get beautiful stack dumps from threaded code over there. :-)
Comment 15 Howard Hinnant 2016-06-20 21:37:55 UTC
To help clarify my proposal, here is a patch:

diff --git a/libstdc++-v3/src/c++11/thread.cc b/libstdc++-v3/src/c++11/thread.cc
index 906cafa..cfca178 100644
--- a/libstdc++-v3/src/c++11/thread.cc
+++ b/libstdc++-v3/src/c++11/thread.cc
@@ -79,19 +79,7 @@ namespace std _GLIBCXX_VISIBILITY(default)
       thread::__shared_base_type __local;
       __local.swap(__t->_M_this_ptr);
 
-      __try
-	{
-	  __t->_M_run();
-	}
-      __catch(const __cxxabiv1::__forced_unwind&)
-	{
-	  __throw_exception_again;
-	}
-      __catch(...)
-	{
-	  std::terminate();
-	}
-
+       __t->_M_run();
       return nullptr;
     }
   }

My colleague Miguel Portilla has tested this patch with the following code:

#include <thread>
#include <stdexcept>

void a(int i);
void b(int i);
void c(int i);

int
main()
{
    auto t = std::thread{a, 3};
    t.join();
}

void
a(int i)
{
    b(i-1);
}

void
b(int i)
{
    c(i-1);
}

void
c(int i)
{
    throw std::runtime_error("A good message");
}

And the stack dump looks like:

Thread 2 (Thread 0x7f090b47c740 (LWP 1865)):
#0  0x00007f090b0688ed in pthread_join (threadid=139676803389184, thread_return=0x0) at pthread_join.c:90
#1  0x00007f090ad9c767 in __gthread_join (__value_ptr=0x0, __threadid=<optimized out>)
    at /home/mickey/gcc-5.4.0/build/x86_64-linux-gnu/libstdc++-v3/include/x86_64-linux-gnu/bits/gthr-default.h:668
#2  std::thread::join (this=0x7ffd1dae6430) at ../../../../../libstdc++-v3/src/c++11/thread.cc:96
#3  0x0000000000400f39 in main ()

Thread 1 (Thread 0x7f090a3fd700 (LWP 1866)):
#0  0x00007f090a433267 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007f090a434eca in __GI_abort () at abort.c:89
#2  0x00007f090ad747dd in __gnu_cxx::__verbose_terminate_handler ()
    at ../../../../libstdc++-v3/libsupc++/vterminate.cc:95
#3  0x00007f090ad72866 in __cxxabiv1::__terminate (handler=<optimized out>)
    at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:47
#4  0x00007f090ad728b1 in std::terminate () at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:57
#5  0x00007f090ad72ac8 in __cxxabiv1::__cxa_throw (obj=0x7f0904000940, 
    tinfo=0x604360 <typeinfo for std::runtime_error@@GLIBCXX_3.4>, 
    dest=0x400ce0 <std::runtime_error::~runtime_error()@plt>) at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:87
#6  0x0000000000400fdd in c(int) ()
#7  0x0000000000400fa0 in b(int) ()
#8  0x0000000000400f85 in a(int) ()
#9  0x0000000000402574 in void std::_Bind_simple<void (*(int))(int)>::_M_invoke<0ul>(std::_Index_tuple<0ul>) ()
#10 0x0000000000402493 in std::_Bind_simple<void (*(int))(int)>::operator()() ()
#11 0x0000000000402432 in std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> >::_M_run() ()
#12 0x00007f090ad9c820 in std::execute_native_thread_routine (__p=<optimized out>)
    at ../../../../../libstdc++-v3/src/c++11/thread.cc:82
#13 0x00007f090b0676aa in start_thread (arg=0x7f090a3fd700) at pthread_create.c:333
#14 0x00007f090a504e9d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Comment 16 Jonathan Wakely 2017-06-12 16:38:00 UTC
Author: redi
Date: Mon Jun 12 16:37:28 2017
New Revision: 249130

URL: https://gcc.gnu.org/viewcvs?rev=249130&root=gcc&view=rev
Log:
PR libstdc++/55917 do not handle exceptions in std::thread

	PR libstdc++/55917
	* src/c++11/thread.cc (execute_native_thread_routine): Remove
	try-block so that exceptions propagate out of the thread and terminate
	is called by the exception-handling runtime.
	(execute_native_thread_routine_compat): Likewise.
	* testsuite/30_threads/thread/cons/terminate.cc: New.

Added:
    trunk/libstdc++-v3/testsuite/30_threads/thread/cons/terminate.cc
Modified:
    trunk/libstdc++-v3/ChangeLog
    trunk/libstdc++-v3/src/c++11/thread.cc
Comment 17 Jonathan Wakely 2017-06-12 16:39:43 UTC
Fixed for GCC 8.