Created attachment 26904 [details] Testcase to reproduce There's a problem with mt_allocator when using multi-threads. see example attached (g++ -lpthread). At exit time, - the __freelist destructor is called. which does a __gthread_key_delete(_M_key). - and then afterwards, the std::list<> destructor is called. This finally have for effect to use that thread key with gthread_setspecific. Valgrind reports : ================================= ==5212== Invalid read of size 8 ==5212== at 0x4CA009D: __gnu_cxx::__pool<true>::_M_get_thread_id() (in ../gcc-4.6.3/lib64/libstdc++.so.6.0.16) ==5212== by 0x4CA0173: __gnu_cxx::__pool<true>::_M_reclaim_block(char*, unsigned long) (in ../gcc-4.6.3/lib64/libstdc++.so.6.0.16) ==5212== by 0x401989: __gnu_cxx::__mt_alloc<std::_List_node<std::string>, __gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, true> >::deallocate(std::_List_node<std::string>*, unsigned long) (in /tmp/a.out) ==5212== by 0x401847: std::_List_base<std::string, __gnu_cxx::__mt_alloc<std::string, __gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, true> > >::_M_put_node(std::_List_node<std::string>*) (in /tmp/a.out) ==5212== by 0x40168C: std::_List_base<std::string, __gnu_cxx::__mt_alloc<std::string, __gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, true> > >::_M_clear() (in /tmp/a.out) ==5212== by 0x40151A: std::_List_base<std::string, __gnu_cxx::__mt_alloc<std::string, __gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, true> > >::~_List_base() (in /tmp/a.out) ==5212== by 0x401BB1: std::list<std::string, __gnu_cxx::__mt_alloc<std::string, __gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, true> > >::~list() (in /tmp/a.out) ==5212== by 0x3814E30C94: exit (in /lib64/tls/libc-2.3.4.so) ==5212== by 0x3814E1C411: (below main) (in /lib64/tls/libc-2.3.4.so) ==5212== Address 0x5b5ce88 is 24 bytes inside a block of size 65,536 free'd ==5212== at 0x4A077EC: operator delete(void*) (vg_replace_malloc.c:457) ==5212== by 0x3814E30C94: exit (in /lib64/tls/libc-2.3.4.so) ==5212== by 0x3814E1C411: (below main) (in /lib64/tls/libc-2.3.4.so) ================================= I ve seen a quite close bug report (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22309), that is already fixed in 4.0.0. This apparently leaded to re-write a part of mt_allocate (Jakub Jelinek). But this was not sufficient for the current bug report i m reporting. Thanks in advance. Laurent Alfonsi
I would propose to fix as follows : - Set the _M_Key=NULL when calling the freelist desctuctor. - Testing the NULL Key before using it with setspecific. see patch below : =================================================================== --- libstdc++-v3/src/mt_allocator.cc (revision 2486) +++ libstdc++-v3/src/mt_allocator.cc (working copy) @@ -47,6 +47,7 @@ if (_M_thread_freelist_array) { __gthread_key_delete(_M_key); + _M_key = NULL; ::operator delete(static_cast<void*>(_M_thread_freelist_array)); } } @@ -639,7 +640,8 @@ } } - __gthread_setspecific(freelist._M_key, (void*)_M_id); + if (freelist._M_key) + __gthread_setspecific(freelist._M_key, (void*)_M_id); } return _M_id >= _M_options._M_max_threads ? 0 : _M_id; } =================================================================== Let me know what you think Thanks very much Laurent Alfonsi
To be clear: are you seeing the valgrind errors going away with the patchlet applied? First blush, I don't.
Well, in fact I am facing a runtime crash on another target (SH4). The crash is fixed by the patch proposed previously. On the other hand, I ve tried to reproduce on x86, to easily report on gcc bugs. and I ve discovered this valgrind error. I first thought both issues were the same, but they are not. And you are right, my patch doesn't fix the valgrind error. Sorry for confusion. The _M_thread_freelist might also be set to NULL at some point. (may be also in ~__freelist()). Setting _M_thread_freelist to NULL at ~__freelist() is solves the valgrind error, but I m not sure this is the right place to do it. Thanks
In principle I have no problem with such zeroings, make sense, but it's been a while since the last time I looked into this code and I fear races. In any case, please try to fully run the testsuite (check-performance too, which exercised mt_allocator quite a bit) on the affected targets. Jakub, any comments on this?
It surprises me. The destructors of libstdc++, on which the binary depends on, should be run after the destructors of the binary, not before it.
Sorry for the late answer (sickness). Well, In fact there are 2 things in the atexit list : 1) The destruction of the "list_type" global variable. Destructor is : "std::list<std::string, __gnu_cxx::__mt_alloc<std::string, __gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, true> > >::~list()" => This one is registered in atexit list by the __static_initialization_and_destruction_0() because it is a global variable. 2) The destruction of the mt_allocator itself "(anonymous namespace)::__freelist::~__freelist()" => This one is registered in atexit list during the main() execution. when first using the mt_allocator (first called get_freelist()) The above 2 destructors are registered in atexit list in this order (1 then 2). Then, they are invoked in the reverse order (2 then 1). The freelist is first destroyed (where i proposed to add zeroing), followed by the destruction of the list_type that invokes _M_get_thread_id() : __gnu_cxx::__pool<true>::_M_get_thread_id () __gnu_cxx::__pool<true>::_M_reclaim_block () __gnu_cxx::__mt_alloc<std::_List_node<std::string>, __gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, true> >::deallocate(std::_List_node<std::string>*, unsigned long) () std::_List_base<std::string, __gnu_cxx::__mt_alloc<std::string, __gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, true> > >::_M_put_node(std::_List_node<std::string>*) () std::_List_base<std::string, __gnu_cxx::__mt_alloc<std::string, __gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, true> > >::_M_clear() () std::_List_base<std::string, __gnu_cxx::__mt_alloc<std::string, __gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, true> > >::~_List_base() () std::list<std::string, __gnu_cxx::__mt_alloc<std::string, __gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, true> > >::~list() () exit () Jakub, do you think this order of destruction is not justified ? Do you see another solution than the one i proposed ? Thanks very much. Laurent
Ping ? Here is the patch proposed. ============================ --- libstdc++-v3/src/mt_allocator.cc 2011/08/04 07:56:49 2064 +++ libstdc++-v3/src/mt_allocator.cc 2012/04/03 08:45:48 2551 @@ -47,7 +47,9 @@ if (_M_thread_freelist_array) { __gthread_key_delete(_M_key); + _M_key = NULL; ::operator delete(static_cast<void*>(_M_thread_freelist_array)); + _M_thread_freelist = NULL; } } }; @@ -639,7 +641,10 @@ } } - __gthread_setspecific(freelist._M_key, (void*)_M_id); + if (freelist._M_key) + { + __gthread_setspecific(freelist._M_key, (void*)_M_id); + } } return _M_id >= _M_options._M_max_threads ? 0 : _M_id; } =====================================
Note, stylistically, no curly brackets in the if body, and also no NULL, just 0 in C++98. I also note that you are not patching mainline, all the fixes go to mainline first. Finally, more substantively, how did you test the patch? Running make check-performance exercises mt_allocator quite a bit, for example. Are all the leaks / crashes gone with the patch?
Paolo, I ve discovered that the encoding of M_key is encoded differently for each thread-implementation. On pthread implementation M_key is an integer, whereas on others it is pointers. So, in pthread 0 is a legal thread key: So zeroing M_key and testing if(M_key) is simply wrong. The M_thread_freelist zeroing seems legal and is enough to avoid the above valgrind error. I am now testing with this fix only: ============================ --- libstdc++-v3/src/mt_allocator.cc +++ libstdc++-v3/src/mt_allocator.cc @@ -47,7 +47,8 @@ if (_M_thread_freelist_array) { __gthread_key_delete(_M_key); ::operator delete(static_cast<void*>(_M_thread_freelist_array)); + _M_thread_freelist = NULL; } } }; ============================ It seems to be on each thread-implementation to handle deleted keys and it is well handled on pthread. Laurent.
Ok, thanks. Please let us know how it goes.
In mainline, for x86_64-linux, the below patchlet indeed avoids the valgrind errors and passes make check, make check-performance. Index: src/c++98/mt_allocator.cc =================================================================== --- src/c++98/mt_allocator.cc (revision 186374) +++ src/c++98/mt_allocator.cc (working copy) @@ -48,6 +48,7 @@ { __gthread_key_delete(_M_key); ::operator delete(static_cast<void*>(_M_thread_freelist_array)); + _M_thread_freelist = 0; } } };
Thanks very much Paolo. It works fine on SH4 also. Is that enough to push the patch ? Laurent
I suppose so, yes. We could test the fix in mainline for a while and then maybe also consider having it for 4.7.1 or 4.7.2. Care to write down a short ChangeLog entry for it?
Thanks very much Paolo. Here it is. I'll commit the patch in the mainline if no objection. Laurent 2012-04-13 Laurent Alfonsi <laurent.alfonsi@st.com> PR libstdc++/52604 * src/mt_allocator.cc: (~__freelist): Reset pointer. Index: src/c++98/mt_allocator.cc =================================================================== --- src/c++98/mt_allocator.cc (revision 186374) +++ src/c++98/mt_allocator.cc (working copy) @@ -48,6 +48,7 @@ { __gthread_key_delete(_M_key); ::operator delete(static_cast<void*>(_M_thread_freelist_array)); + _M_thread_freelist = 0; } } };
(In reply to comment #14) > > PR libstdc++/52604 > * src/mt_allocator.cc: (~__freelist): Reset pointer. ^ c++98/
Yes, please go ahead, mainline only for now (PR remains open) with Jon's fix to the ChangeLog and also, between parentheses (__freelist::~__freelist).
Ah, another minor nit, remember to add 2012 to the list of Copyright Years (mind to keep the comment within 80 columns)
Author: chrbr Date: Fri Apr 13 11:44:13 2012 New Revision: 186414 Log Message: PR:52604: (~__freelist): Reset pointer Modified: trunk/libstdc++-v3/ChangeLog trunk/libstdc++-v3/src/c++98/mt_allocator.cc
Author: chrbr Date: Fri Apr 13 11:58:15 2012 New Revision: 186415 Log Message: fix last entry Modified: trunk/libstdc++-v3/ChangeLog
Remember to always send the patches you commit to gcc-patches@gcc.gnu.org (and libstdc++@gcc.gnu.org in CC), even if already approved on the fly in audit trail (which should not happen very frequently) PS: are you aware that your name family name in the email address and as email sender do not match? ;)
(In reply to comment #20) > Remember to always send the patches you commit to gcc-patches@gcc.gnu.org (and > libstdc++@gcc.gnu.org in CC), even if already approved on the fly in audit > trail (which should not happen very frequently) > > PS: are you aware that your name family name in the email address and as email > sender do not match? ;) Probably my fault, we are working together and Laurent asked me to push for him, as approved. We'll start the paperwork so he can get the Write After Approval. Sorry for the confusion.
*** Bug 68200 has been marked as a duplicate of this bug. ***
According to the dup the fix is bogus (the added stmt is optimized away).