Bug 29496 - _M_invalidate function is not thread-safe in GLIBCXX_DEBUG mode
Summary: _M_invalidate function is not thread-safe in GLIBCXX_DEBUG mode
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 4.0.2
: P3 normal
Target Milestone: 4.2.0
Assignee: Paolo Carlini
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-10-18 08:33 UTC by lukasz
Modified: 2007-05-14 22:39 UTC (History)
7 users (show)

See Also:
Host: Linux 2.6.9-42.ELsmp #1 SMP
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2006-10-31 16:32:28


Attachments
Draft patch for the issue in Comment #8 only (916 bytes, patch)
2006-10-31 14:30 UTC, Paolo Carlini
Details | Diff
Draft patch using the _M_get_mutex idiom for _M_invalidate (1.43 KB, patch)
2006-10-31 17:01 UTC, Paolo Carlini
Details | Diff
All the functions working on a set of _Safe_iterators fixed (2.86 KB, patch)
2006-11-01 22:42 UTC, Paolo Carlini
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description lukasz 2006-10-18 08:33:26 UTC
_M_invalidate function in safe_iterator.tcc is not thread safe against _M_detach()/_M_attach() functions from _Safe_iterator_base.

Here is the example of erronous scenario:

Thread1()
{
   list::iterator iter;
   lock(mutex);
   iter = myList.begin();
   ....
   // do something with iter
   ....
   unlock();  
}

Thread2()
{
   lock(mutex);
   myList.erase(myList.begin());
   unlock();  
}

The problem in this scenario is that while calling erase(), _M_invalidate() function is called which goes through the list of all iterators without taking a lock on __gnu_internal::iterator_base_mutex. At the same time another thread can be deallocating lists iterator - thus calling _M_detach() function.
This leads to a SIGSEGV in _M_invalidate() function.

You should either remove concurrency support from GLIBCXX_DEBUG mode or provide full support.
Comment 1 Paolo Carlini 2006-10-18 09:00:57 UTC
Benjamin and Doug, can you have a look to this issue? Thanks.
Comment 2 Benjamin Kosnik 2006-10-18 09:43:11 UTC
Please attach a complete test case, not a sketch.

-benjamin
Comment 3 Paolo Carlini 2006-10-18 10:06:40 UTC
Let's suppose the issue is confirmed, and we have an useful complete testcase, then, a wild idea (at the moment I'm missing many details about the internals of the safe iterators, sorry): maybe we could explore the possibility to have the _M_invalidate functionality part of _Safe_iterator_base, thus easily use the debug mode mutex. I'm saying that because I'm under the impression that _M_invalidate basically uses info available in the _Safe_iterator_base. I'm only unsure about the tests "this->base() == __victim->base()", I hope can be done on the base objects too...
Comment 4 Paolo Carlini 2006-10-18 10:40:22 UTC
(In reply to comment #3)
>                                                                      .... I'm
> only unsure about the tests "this->base() == __victim->base()", I hope can be
> done on the base objects too...

And of course it can't, such simply, the plain iterator info is necessary here, because we want to change *all* iterators pointing the same as this->base()... In case, looks like _M_invalidate needs a more complex interaction between _Safe_iterator and _Safe_iterator_base... 

Comment 5 Grzegorz Calkowski 2006-10-18 14:40:20 UTC
I encountered this bug (?) in the past. I thought, maybe incorrectly, that accessing an interator (e.g copying it) equals read access to a container, hence no thread safety guarnatees for concurrent modifications to the container - like erase on another element.

OTOH, this works fine in non-debug mode.
Comment 6 Benjamin Kosnik 2006-10-18 15:56:34 UTC
Paolo: Seems like an interesting idea.

Grzegorz: interesting that others have run into this. Without a testcase, it's hard to say with certainty what is valid and what is invalid. Also, without a testcase there are no certainties that if fixed, it remains fixed. Thus, why I'm insisting on a testcase. Care to give it a go?

best,
benjamin
Comment 7 lukasz 2006-10-19 13:51:46 UTC
(In reply to comment #2)
> Please attach a complete test case, not a sketch.
> 
> -benjamin
> 

This bug is a pure race-condition. There is no test case which would reveal it with a 100% certainty. I have noticed the bug after receiving strange SIGSEGV in _M_invalidate() function. After reviewing the code I found the problem stated in this bug report.

Of course you can say: "No test case - no bug" but I guess this would be a very naive approach.
Comment 8 lukasz 2006-10-31 11:12:09 UTC
There is a similar problem in _M_detach_all() function. It iterates through list of iterators without obtaining lock.
Comment 9 Paolo Carlini 2006-10-31 11:38:15 UTC
To be honest, I don't think this code has been designed with thread safety in mind. Fixing it completely after the fact is going to be very complex, I'm afraid. I'll try to spend some more time on those issues, but if Doug, the original author, doesn't provide any special hint, I'm afraid we have to live with that or even, for consistency, remove the minimal MT support currently present and add something clear to the documentation.
Comment 10 Paolo Carlini 2006-10-31 11:47:28 UTC
... that said, if we don't care *at all* about performance, fixing MT-safety bugs in the _Safe_sequence or _Safe_iterator functions it's easy, just use everywhere the iterator_base_mutex. The nasty correctness problems are those involving the templatized _Safe_iterator functions, which of course are not in debug.cc.
Comment 11 Paolo Carlini 2006-10-31 11:49:04 UTC
(In reply to comment #10)
> ... that said, if we don't care *at all* about performance, fixing MT-safety
> bugs in the _Safe_sequence or _Safe_iterator functions it's easy,...

Of course I meant _Safe_sequence_base and _Safe_iterator_base, humpf.
Comment 12 Paolo Carlini 2006-10-31 12:19:41 UTC
About _M_detach_all specifically, it's called only by ~_Safe_sequence_base(), thus only when the container itself is destructed. Therefore, I don't think it may cause problems in practice.
Comment 13 lukasz 2006-10-31 13:12:32 UTC
(In reply to comment #12)
> About _M_detach_all specifically, it's called only by ~_Safe_sequence_base(),
> thus only when the container itself is destructed. Therefore, I don't think it
> may cause problems in practice.
> 

Unfortunatelly it may. Suppose there are two threads: consumer/producer where producer is creating container, taking iterator to the container and passing the container to consumer thread.  Consumer thread is simply deallocating the container - thus calling _M_detach_all function. At the same time iterator destructor is called which causes SIGSEGV.

We actually encountered this problem - this is not taken from code review only.
Comment 14 Paolo Carlini 2006-10-31 13:23:43 UTC
(In reply to comment #13)
> We actually encountered this problem - this is not taken from code review only.

Again, adding a mutex to that those *_base functions it's trivial. Now, please start preparing reduced testcases for such issues, because apprently you *know* how to do that. The approach "No test case - no bug" maybe can be considered "naive" as you say, on the other hand enforcing it eventually makes for sound software engineering, because would be even more naive to rely on submitter itself and no one else in the world to test a candidate fix, thus not being able to avoid regressions at a later stage within the project.
Comment 15 lukasz 2006-10-31 13:42:03 UTC
(In reply to comment #14)
> (In reply to comment #13)
> > We actually encountered this problem - this is not taken from code review only.
> 
> Again, adding a mutex to that those *_base functions it's trivial. Now, please
> start preparing reduced testcases for such issues, because apprently you *know*
> how to do that. The approach "No test case - no bug" maybe can be considered
> "naive" as you say, on the other hand enforcing it eventually makes for sound
> software engineering, because would be even more naive to rely on submitter
> itself and no one else in the world to test a candidate fix, thus not being
> able to avoid regressions at a later stage within the project.
> 

I tried to prepare a simple testcase but without any success. It seems to be extremely hard to hit the moment when pointers are actually changed. Trivial programs do not reveal the problem.
Comment 16 Paolo Carlini 2006-10-31 13:45:58 UTC
(In reply to comment #15)
> I tried to prepare a simple testcase but without any success. It seems to be
> extremely hard to hit the moment when pointers are actually changed. Trivial
> programs do not reveal the problem.

Ok, but you should try harder, because often in the past we managed to prepare complete testcases for races and generally MT issues too, which seemed very tricky at the outset. 

Comment 17 Paolo Carlini 2006-10-31 14:30:47 UTC
Created attachment 12517 [details]
Draft patch for the issue in Comment #8 only

For the last issue, can you try this patch? If it works, then probably we can incrementally commit something similar, because it's decently clean and the performance should not be affected, even slightly improved.
Comment 18 Benjamin Kosnik 2006-10-31 14:37:20 UTC
> I tried to prepare a simple testcase but without any success.

This is encouraging, that you attempted it. To be clear: we just need something that we can use to reproduce it. Please try for a complex or big testcase then, and we'll try to reduce it ourselves.

We have many testcases that reproduce races. See testsuite/threads.

-benjamin
Comment 19 Paolo Carlini 2006-10-31 17:01:07 UTC
Created attachment 12520 [details]
Draft patch using the _M_get_mutex idiom for _M_invalidate

This one, instead, using the _M_get_mutex idiom (similarly to pool_allocator, for example), should be able to deal with the _M_invalidate issue too. Agreed, in debug-mode performance isn't the main concern, still I'm a bit worried, because we have got one single mutex for the entire STL. But I have no idea how this problem can be solved at this stage, the ABI still frozen. Besides that general issue, the approach should work, maybe locking is needed also for some _Safe_sequence functions: in that case _Safe_iterator_base::_M_get_mutex would be still usable, being public, or, to be cleaner, an _M_get_mutex method could be added to _Safe_sequence_base too. In any case, I would rather wait for testcases before touching _Safe_sequence...

Benjamin, what do you think?
Comment 20 lukasz 2006-10-31 17:23:53 UTC
(In reply to comment #16)
> (In reply to comment #15)
> > I tried to prepare a simple testcase but without any success. It seems to be
> > extremely hard to hit the moment when pointers are actually changed. Trivial
> > programs do not reveal the problem.
> 
> Ok, but you should try harder, because often in the past we managed to prepare
> complete testcases for races and generally MT issues too, which seemed very
> tricky at the outset. 
> 

I think we are totally misunderstood here. I don't have time to prepare complex test cases revealing problems not in my code. I have found the bug I even found the cause of the bug and it is still not enough for you.

If you don't want to be informed about problems in stl then what is this bugzilla for?

I have already made a workaround in my code to avoid the debug container in certain situations so it's not me who will actually benefit from the fix.
I reported this issue to help other developers who might encounter similar problem.
Because of version policies in my company probably I won't be even able to use your fix.
Comment 21 Paolo Carlini 2006-10-31 23:53:30 UTC
(In reply to comment #20)
> I think we are totally misunderstood here.

No, no, there is something *completely* different to it. Sorry, but I'm now convinced that you are here with the *wrong* spirit, aren't technical matters to divide us. At all. Did you ever read any of Stallman' papers? Try, just a little  harder than you are willing to to do to help GCC improving, and you will see that you got the Free Software spirit wrong:

  http://www.fsf.org/
Comment 22 Paolo Carlini 2006-11-01 22:42:28 UTC
Created attachment 12531 [details]
All the functions working on a set of _Safe_iterators fixed

In my present understanding *all* the functions belonging to _Safe_iterator_base, _Safe_iterator, _Safe_sequence_base, and _Safe_sequence going through a full set of iterators attached to a container are currently weak wrt the scenario presented in this PR, where _M_detach is called by iterator destructor. This new draft patch reflects that, consistently. I'm currently testing it, so far everything seems ok on x86-linux and ia64-linux four-way.
Comment 23 paolo@gcc.gnu.org 2006-11-11 17:32:30 UTC
Subject: Bug 29496

Author: paolo
Date: Sat Nov 11 17:32:12 2006
New Revision: 118701

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=118701
Log:
2006-11-11  Paolo Carlini  <pcarlini@suse.de>

	PR libstdc++/29496
	* include/debug/safe_base.h (_Safe_sequence_base::_M_get_mutex,
	_Safe_iterator_base::_M_get_mutex, _M_attach_single, _M_detach_single):
	New.
	* src/debug.cc: Define the latter.
	(_Safe_sequence_base::_M_detach_all, _M_detach_singular,
	_M_revalidate_singular, _M_swap): Use the mutex.
	(_Safe_iterator_base::_M_attach, _M_detach): Adjust, forward to the
	*_single version.
	* include/debug/safe_iterator.h (_Safe_iterator<>::_M_attach_single,
	_M_invalidate_single): New.
 	* include/debug/safe_iterator.tcc: Define.
	(_Safe_iterator<>::_M_invalidate): Adjust, forward to
	_M_invalidate_single.
	* include/debug/safe_sequence.h (_Safe_sequence<>::_M_invalidate_if,
	_M_transfer_iter): Use the mutex, adjust, forward to the *_single
	versions of _M_invalidate and _M_attach.
	* config/abi/pre/gnu.ver (_Safe_sequence_base::_M_get_mutex,
	_Safe_iterator_base::_M_get_mutex, _M_attach_single, _M_detach_single):
	Add @GLIBCXX_3.4.10; adjust.
	* configure.ac (libtool_VERSION): To 6:10:0.
	* testsuite/util/testsuite_abi.cc (check_version): Add GLIBCXX_3.4.10.
	* configure: Regenerate.

Modified:
    trunk/libstdc++-v3/ChangeLog
    trunk/libstdc++-v3/config/abi/pre/gnu.ver
    trunk/libstdc++-v3/configure
    trunk/libstdc++-v3/configure.ac
    trunk/libstdc++-v3/include/debug/safe_base.h
    trunk/libstdc++-v3/include/debug/safe_iterator.h
    trunk/libstdc++-v3/include/debug/safe_iterator.tcc
    trunk/libstdc++-v3/include/debug/safe_sequence.h
    trunk/libstdc++-v3/src/debug.cc
    trunk/libstdc++-v3/testsuite/util/testsuite_abi.cc

Comment 24 paolo@gcc.gnu.org 2007-02-01 15:56:57 UTC
Subject: Bug 29496

Author: paolo
Date: Thu Feb  1 15:56:37 2007
New Revision: 121465

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=121465
Log:
2007-02-01  Paolo Carlini  <pcarlini@suse.de>

	PR libstdc++/14493
	* libsupc++/typeinfo (bad_cast::what, bad_typeid::what): Declare.
	* libsupc++/tinfo.cc: Define.	
	* libsupc++/exception (bad_exception::what): Declare.
	* libsupc++/eh_exception.cc: Define.
	(exception::what): Adjust, don't use typeid.
	* libsupc++/new (bad_alloc::what): Declare.
	* libsupc++/new_handler.cc: Define.
	* config/abi/pre/gnu.ver: Export the new methods @3.4.9.
	* testsuite/18_support/14493.cc: New.

2007-02-01  Paolo Carlini  <pcarlini@suse.de>

	PR libstdc++/29496
	* include/debug/safe_base.h (_Safe_sequence_base::_M_get_mutex,
	_Safe_iterator_base::_M_get_mutex, _M_attach_single, _M_detach_single):
	New.
	* src/debug.cc: Define the latter.
	(_Safe_sequence_base::_M_detach_all, _M_detach_singular,
	_M_revalidate_singular, _M_swap): Use the mutex.
	(_Safe_iterator_base::_M_attach, _M_detach): Adjust, forward to the
	*_single version.
	* include/debug/safe_iterator.h (_Safe_iterator<>::_M_attach_single,
	_M_invalidate_single): New.
 	* include/debug/safe_iterator.tcc: Define.
	(_Safe_iterator<>::_M_invalidate): Adjust, forward to
	_M_invalidate_single.
	* include/debug/safe_sequence.h (_Safe_sequence<>::_M_invalidate_if,
	_M_transfer_iter): Use the mutex, adjust, forward to the *_single
	versions of _M_invalidate and _M_attach.
	* config/abi/pre/gnu.ver (_Safe_sequence_base::_M_get_mutex,
	_Safe_iterator_base::_M_get_mutex, _M_attach_single, _M_detach_single):
	Add @GLIBCXX_3.4.9; adjust.

Added:
    branches/gcc-4_2-branch/libstdc++-v3/testsuite/18_support/14493.cc
Modified:
    branches/gcc-4_2-branch/libstdc++-v3/ChangeLog
    branches/gcc-4_2-branch/libstdc++-v3/config/abi/pre/gnu.ver
    branches/gcc-4_2-branch/libstdc++-v3/include/debug/safe_base.h
    branches/gcc-4_2-branch/libstdc++-v3/include/debug/safe_iterator.h
    branches/gcc-4_2-branch/libstdc++-v3/include/debug/safe_iterator.tcc
    branches/gcc-4_2-branch/libstdc++-v3/include/debug/safe_sequence.h
    branches/gcc-4_2-branch/libstdc++-v3/libsupc++/eh_exception.cc
    branches/gcc-4_2-branch/libstdc++-v3/libsupc++/exception
    branches/gcc-4_2-branch/libstdc++-v3/libsupc++/new
    branches/gcc-4_2-branch/libstdc++-v3/libsupc++/new_handler.cc
    branches/gcc-4_2-branch/libstdc++-v3/libsupc++/tinfo.cc
    branches/gcc-4_2-branch/libstdc++-v3/libsupc++/typeinfo
    branches/gcc-4_2-branch/libstdc++-v3/src/debug.cc

Comment 25 Mark Mitchell 2007-05-14 22:26:44 UTC
Will not be fixed in 4.2.0; retargeting at 4.2.1.
Comment 26 Paolo Carlini 2007-05-14 22:30:40 UTC
Actually, this is already fixed for 4.2.0! How to mark it as such?
Comment 27 Paolo Carlini 2007-05-14 22:39:30 UTC
Ok, fixed the target milestone.