_M_invalidate function in safe_iterator.tcc is not thread safe against _M_detach()/_M_attach() functions from _Safe_iterator_base. Here is the example of erronous scenario: Thread1() { list::iterator iter; lock(mutex); iter = myList.begin(); .... // do something with iter .... unlock(); } Thread2() { lock(mutex); myList.erase(myList.begin()); unlock(); } The problem in this scenario is that while calling erase(), _M_invalidate() function is called which goes through the list of all iterators without taking a lock on __gnu_internal::iterator_base_mutex. At the same time another thread can be deallocating lists iterator - thus calling _M_detach() function. This leads to a SIGSEGV in _M_invalidate() function. You should either remove concurrency support from GLIBCXX_DEBUG mode or provide full support.
Benjamin and Doug, can you have a look to this issue? Thanks.
Please attach a complete test case, not a sketch. -benjamin
Let's suppose the issue is confirmed, and we have an useful complete testcase, then, a wild idea (at the moment I'm missing many details about the internals of the safe iterators, sorry): maybe we could explore the possibility to have the _M_invalidate functionality part of _Safe_iterator_base, thus easily use the debug mode mutex. I'm saying that because I'm under the impression that _M_invalidate basically uses info available in the _Safe_iterator_base. I'm only unsure about the tests "this->base() == __victim->base()", I hope can be done on the base objects too...
(In reply to comment #3) > .... I'm > only unsure about the tests "this->base() == __victim->base()", I hope can be > done on the base objects too... And of course it can't, such simply, the plain iterator info is necessary here, because we want to change *all* iterators pointing the same as this->base()... In case, looks like _M_invalidate needs a more complex interaction between _Safe_iterator and _Safe_iterator_base...
I encountered this bug (?) in the past. I thought, maybe incorrectly, that accessing an interator (e.g copying it) equals read access to a container, hence no thread safety guarnatees for concurrent modifications to the container - like erase on another element. OTOH, this works fine in non-debug mode.
Paolo: Seems like an interesting idea. Grzegorz: interesting that others have run into this. Without a testcase, it's hard to say with certainty what is valid and what is invalid. Also, without a testcase there are no certainties that if fixed, it remains fixed. Thus, why I'm insisting on a testcase. Care to give it a go? best, benjamin
(In reply to comment #2) > Please attach a complete test case, not a sketch. > > -benjamin > This bug is a pure race-condition. There is no test case which would reveal it with a 100% certainty. I have noticed the bug after receiving strange SIGSEGV in _M_invalidate() function. After reviewing the code I found the problem stated in this bug report. Of course you can say: "No test case - no bug" but I guess this would be a very naive approach.
There is a similar problem in _M_detach_all() function. It iterates through list of iterators without obtaining lock.
To be honest, I don't think this code has been designed with thread safety in mind. Fixing it completely after the fact is going to be very complex, I'm afraid. I'll try to spend some more time on those issues, but if Doug, the original author, doesn't provide any special hint, I'm afraid we have to live with that or even, for consistency, remove the minimal MT support currently present and add something clear to the documentation.
... that said, if we don't care *at all* about performance, fixing MT-safety bugs in the _Safe_sequence or _Safe_iterator functions it's easy, just use everywhere the iterator_base_mutex. The nasty correctness problems are those involving the templatized _Safe_iterator functions, which of course are not in debug.cc.
(In reply to comment #10) > ... that said, if we don't care *at all* about performance, fixing MT-safety > bugs in the _Safe_sequence or _Safe_iterator functions it's easy,... Of course I meant _Safe_sequence_base and _Safe_iterator_base, humpf.
About _M_detach_all specifically, it's called only by ~_Safe_sequence_base(), thus only when the container itself is destructed. Therefore, I don't think it may cause problems in practice.
(In reply to comment #12) > About _M_detach_all specifically, it's called only by ~_Safe_sequence_base(), > thus only when the container itself is destructed. Therefore, I don't think it > may cause problems in practice. > Unfortunatelly it may. Suppose there are two threads: consumer/producer where producer is creating container, taking iterator to the container and passing the container to consumer thread. Consumer thread is simply deallocating the container - thus calling _M_detach_all function. At the same time iterator destructor is called which causes SIGSEGV. We actually encountered this problem - this is not taken from code review only.
(In reply to comment #13) > We actually encountered this problem - this is not taken from code review only. Again, adding a mutex to that those *_base functions it's trivial. Now, please start preparing reduced testcases for such issues, because apprently you *know* how to do that. The approach "No test case - no bug" maybe can be considered "naive" as you say, on the other hand enforcing it eventually makes for sound software engineering, because would be even more naive to rely on submitter itself and no one else in the world to test a candidate fix, thus not being able to avoid regressions at a later stage within the project.
(In reply to comment #14) > (In reply to comment #13) > > We actually encountered this problem - this is not taken from code review only. > > Again, adding a mutex to that those *_base functions it's trivial. Now, please > start preparing reduced testcases for such issues, because apprently you *know* > how to do that. The approach "No test case - no bug" maybe can be considered > "naive" as you say, on the other hand enforcing it eventually makes for sound > software engineering, because would be even more naive to rely on submitter > itself and no one else in the world to test a candidate fix, thus not being > able to avoid regressions at a later stage within the project. > I tried to prepare a simple testcase but without any success. It seems to be extremely hard to hit the moment when pointers are actually changed. Trivial programs do not reveal the problem.
(In reply to comment #15) > I tried to prepare a simple testcase but without any success. It seems to be > extremely hard to hit the moment when pointers are actually changed. Trivial > programs do not reveal the problem. Ok, but you should try harder, because often in the past we managed to prepare complete testcases for races and generally MT issues too, which seemed very tricky at the outset.
Created attachment 12517 [details] Draft patch for the issue in Comment #8 only For the last issue, can you try this patch? If it works, then probably we can incrementally commit something similar, because it's decently clean and the performance should not be affected, even slightly improved.
> I tried to prepare a simple testcase but without any success. This is encouraging, that you attempted it. To be clear: we just need something that we can use to reproduce it. Please try for a complex or big testcase then, and we'll try to reduce it ourselves. We have many testcases that reproduce races. See testsuite/threads. -benjamin
Created attachment 12520 [details] Draft patch using the _M_get_mutex idiom for _M_invalidate This one, instead, using the _M_get_mutex idiom (similarly to pool_allocator, for example), should be able to deal with the _M_invalidate issue too. Agreed, in debug-mode performance isn't the main concern, still I'm a bit worried, because we have got one single mutex for the entire STL. But I have no idea how this problem can be solved at this stage, the ABI still frozen. Besides that general issue, the approach should work, maybe locking is needed also for some _Safe_sequence functions: in that case _Safe_iterator_base::_M_get_mutex would be still usable, being public, or, to be cleaner, an _M_get_mutex method could be added to _Safe_sequence_base too. In any case, I would rather wait for testcases before touching _Safe_sequence... Benjamin, what do you think?
(In reply to comment #16) > (In reply to comment #15) > > I tried to prepare a simple testcase but without any success. It seems to be > > extremely hard to hit the moment when pointers are actually changed. Trivial > > programs do not reveal the problem. > > Ok, but you should try harder, because often in the past we managed to prepare > complete testcases for races and generally MT issues too, which seemed very > tricky at the outset. > I think we are totally misunderstood here. I don't have time to prepare complex test cases revealing problems not in my code. I have found the bug I even found the cause of the bug and it is still not enough for you. If you don't want to be informed about problems in stl then what is this bugzilla for? I have already made a workaround in my code to avoid the debug container in certain situations so it's not me who will actually benefit from the fix. I reported this issue to help other developers who might encounter similar problem. Because of version policies in my company probably I won't be even able to use your fix.
(In reply to comment #20) > I think we are totally misunderstood here. No, no, there is something *completely* different to it. Sorry, but I'm now convinced that you are here with the *wrong* spirit, aren't technical matters to divide us. At all. Did you ever read any of Stallman' papers? Try, just a little harder than you are willing to to do to help GCC improving, and you will see that you got the Free Software spirit wrong: http://www.fsf.org/
Created attachment 12531 [details] All the functions working on a set of _Safe_iterators fixed In my present understanding *all* the functions belonging to _Safe_iterator_base, _Safe_iterator, _Safe_sequence_base, and _Safe_sequence going through a full set of iterators attached to a container are currently weak wrt the scenario presented in this PR, where _M_detach is called by iterator destructor. This new draft patch reflects that, consistently. I'm currently testing it, so far everything seems ok on x86-linux and ia64-linux four-way.
Subject: Bug 29496 Author: paolo Date: Sat Nov 11 17:32:12 2006 New Revision: 118701 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=118701 Log: 2006-11-11 Paolo Carlini <pcarlini@suse.de> PR libstdc++/29496 * include/debug/safe_base.h (_Safe_sequence_base::_M_get_mutex, _Safe_iterator_base::_M_get_mutex, _M_attach_single, _M_detach_single): New. * src/debug.cc: Define the latter. (_Safe_sequence_base::_M_detach_all, _M_detach_singular, _M_revalidate_singular, _M_swap): Use the mutex. (_Safe_iterator_base::_M_attach, _M_detach): Adjust, forward to the *_single version. * include/debug/safe_iterator.h (_Safe_iterator<>::_M_attach_single, _M_invalidate_single): New. * include/debug/safe_iterator.tcc: Define. (_Safe_iterator<>::_M_invalidate): Adjust, forward to _M_invalidate_single. * include/debug/safe_sequence.h (_Safe_sequence<>::_M_invalidate_if, _M_transfer_iter): Use the mutex, adjust, forward to the *_single versions of _M_invalidate and _M_attach. * config/abi/pre/gnu.ver (_Safe_sequence_base::_M_get_mutex, _Safe_iterator_base::_M_get_mutex, _M_attach_single, _M_detach_single): Add @GLIBCXX_3.4.10; adjust. * configure.ac (libtool_VERSION): To 6:10:0. * testsuite/util/testsuite_abi.cc (check_version): Add GLIBCXX_3.4.10. * configure: Regenerate. Modified: trunk/libstdc++-v3/ChangeLog trunk/libstdc++-v3/config/abi/pre/gnu.ver trunk/libstdc++-v3/configure trunk/libstdc++-v3/configure.ac trunk/libstdc++-v3/include/debug/safe_base.h trunk/libstdc++-v3/include/debug/safe_iterator.h trunk/libstdc++-v3/include/debug/safe_iterator.tcc trunk/libstdc++-v3/include/debug/safe_sequence.h trunk/libstdc++-v3/src/debug.cc trunk/libstdc++-v3/testsuite/util/testsuite_abi.cc
Subject: Bug 29496 Author: paolo Date: Thu Feb 1 15:56:37 2007 New Revision: 121465 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=121465 Log: 2007-02-01 Paolo Carlini <pcarlini@suse.de> PR libstdc++/14493 * libsupc++/typeinfo (bad_cast::what, bad_typeid::what): Declare. * libsupc++/tinfo.cc: Define. * libsupc++/exception (bad_exception::what): Declare. * libsupc++/eh_exception.cc: Define. (exception::what): Adjust, don't use typeid. * libsupc++/new (bad_alloc::what): Declare. * libsupc++/new_handler.cc: Define. * config/abi/pre/gnu.ver: Export the new methods @3.4.9. * testsuite/18_support/14493.cc: New. 2007-02-01 Paolo Carlini <pcarlini@suse.de> PR libstdc++/29496 * include/debug/safe_base.h (_Safe_sequence_base::_M_get_mutex, _Safe_iterator_base::_M_get_mutex, _M_attach_single, _M_detach_single): New. * src/debug.cc: Define the latter. (_Safe_sequence_base::_M_detach_all, _M_detach_singular, _M_revalidate_singular, _M_swap): Use the mutex. (_Safe_iterator_base::_M_attach, _M_detach): Adjust, forward to the *_single version. * include/debug/safe_iterator.h (_Safe_iterator<>::_M_attach_single, _M_invalidate_single): New. * include/debug/safe_iterator.tcc: Define. (_Safe_iterator<>::_M_invalidate): Adjust, forward to _M_invalidate_single. * include/debug/safe_sequence.h (_Safe_sequence<>::_M_invalidate_if, _M_transfer_iter): Use the mutex, adjust, forward to the *_single versions of _M_invalidate and _M_attach. * config/abi/pre/gnu.ver (_Safe_sequence_base::_M_get_mutex, _Safe_iterator_base::_M_get_mutex, _M_attach_single, _M_detach_single): Add @GLIBCXX_3.4.9; adjust. Added: branches/gcc-4_2-branch/libstdc++-v3/testsuite/18_support/14493.cc Modified: branches/gcc-4_2-branch/libstdc++-v3/ChangeLog branches/gcc-4_2-branch/libstdc++-v3/config/abi/pre/gnu.ver branches/gcc-4_2-branch/libstdc++-v3/include/debug/safe_base.h branches/gcc-4_2-branch/libstdc++-v3/include/debug/safe_iterator.h branches/gcc-4_2-branch/libstdc++-v3/include/debug/safe_iterator.tcc branches/gcc-4_2-branch/libstdc++-v3/include/debug/safe_sequence.h branches/gcc-4_2-branch/libstdc++-v3/libsupc++/eh_exception.cc branches/gcc-4_2-branch/libstdc++-v3/libsupc++/exception branches/gcc-4_2-branch/libstdc++-v3/libsupc++/new branches/gcc-4_2-branch/libstdc++-v3/libsupc++/new_handler.cc branches/gcc-4_2-branch/libstdc++-v3/libsupc++/tinfo.cc branches/gcc-4_2-branch/libstdc++-v3/libsupc++/typeinfo branches/gcc-4_2-branch/libstdc++-v3/src/debug.cc
Will not be fixed in 4.2.0; retargeting at 4.2.1.
Actually, this is already fixed for 4.2.0! How to mark it as such?
Ok, fixed the target milestone.