This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug libstdc++/21334] New: Lack of Posix compliant thread safety in std::basic_string
- From: "jkanze at cheuvreux dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 2 May 2005 11:45:30 -0000
- Subject: [Bug libstdc++/21334] New: Lack of Posix compliant thread safety in std::basic_string
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
I am sending this to the g++ bug list on the recommendation of
Gabriel Dos Reis. From what little I've read in the g++
documentation, I'm not convinced that the authors of the g++
library intend for it to be supported, although Posix would seem
to require it.
For the record, the statement in Posix is: "Applications shall
ensure that access to any memory location by more than one
thread of control (threads or processes) is restricted such that
no thread of control can read or modify a memory location while
another thread of control may be modifying it." The obvious
extension to C++ is that of replacing "memory location" with
"object"; at the very least, of course, one can only require
something of "memory locations" which the user sees, directly or
indirectly. The statement in the libstdc++-v3 FAQ (question
5.6) is: "All library objects are safe to use in a multithreaded
program as long as each thread carefully locks out access by any
other thread while it uses any object visible to another thread,
i.e., treat library objects like any other shared resource. In
general, this requirement includes both read and write access to
objects; unless otherwise documented as safe, do not assume that
two threads may access a shared standard library object at the
same time." A considerably weaker guarantee than what one
normally expects under Posix. (Note that the clause "like any
other shared resource" is simply false for those of us used to
the Posix model. If I replace std::string with char[] in my
code below, the behavior is perfectly defined under Posix.)
The following is an example of a program which may cause
problems:
#include <pthread.h>
#include <iostream>
#include <ostream>
#include <string>
std::string g ;
extern "C" void*
thread1(
void* lock )
{
std::string t( g ) ;
pthread_mutex_lock( static_cast< pthread_mutex_t* >( lock ) ) ;
std::cout << t << '\n' ;
pthread_mutex_unlock( static_cast< pthread_mutex_t* >( lock ) ) ;
return NULL ;
}
extern "C" void*
thread2(
void* lock )
{
std::string t( g.begin(), g.end() ) ;
pthread_mutex_lock( static_cast< pthread_mutex_t* >( lock ) ) ;
std::cout << t << '\n' ;
pthread_mutex_unlock( static_cast< pthread_mutex_t* >( lock ) ) ;
return NULL ;
}
int
main()
{
g = "0123456789" ;
pthread_mutex_t lock ;
pthread_mutex_init( &lock, NULL ) ;
pthread_t t1 ;
if ( pthread_create( &t1, NULL, &thread1, &lock ) != 0 ) {
std::cerr << "Could not create thread1" << std::endl ;
}
pthread_t t2 ;
if ( pthread_create( &t2, NULL, &thread2, &lock ) != 0 ) {
std::cerr << "Could not create thread }
pthread_join( t1, NULL ) ;
pthread_join( t2, NULL ) ;
return 0 ;
}
Consider the following scenario:
-- Thread 2 executes first. It gets to the expression
g.begin() (which for purposes of argument, we will suppose
is called before g.end() -- the problem will occur in which
ever function is called first), and starts executing it.
At this point, the value _M_refcount in the _Rep_base is 0,
since there is only one instance, g, which shares the
representation. The representation is not "leaked", so we
call _M_leak_hard.
_M_leak_hard calls _M_rep()->_M_is_shared(), which returns
false.
-- Thread 1 interupts. Thread 2 calls the copy constructor,
with g as a parameter, which ultimately calls _M_grab on the
_M_is_leaked() returns false, since the _M_refcount is still
0 in the representation. Thread 2 thus calls _M_refcopy()
on the representation, which (atomically) increments
_M_refcount. Thread 1 leaves the copy constructor.
-- Now back to thread 2. Since _M_is_shared() returned false,
thread 2 doesn't call _M_mutate -- is simply calls
_M_set_leaked() on the representation, which sets
_M_refcount to -1.
We will suppose that this modification is visible to all
other threads, despite the fact that there are no memory
barriers around it (which means that this supposition will
be false on certain platforms).
-- And life goes on. The second call to begin()/end() doesn't
change anything, because it finds that the representation is
already "leaked".
-- Finally, suppose that thread 1 finishes while thread 1 is
still using its iterators. Thread 1 calls the destructor
for its string. It sees that _M_refcount < 0, concludes
that the representation is leaked, and deletes it. Despite
the fact that thread 2 still holds iterators refering to it,
and despite the fact that there is still a global variable
(usable after the pthread_joins in main) which depends on
it.
The problem is, of course, that the sequence which tests whether
we have to leak, and then leaks, is not atomic.
--
Summary: Lack of Posix compliant thread safety in
std::basic_string
Product: gcc
Version: 3.4.3
Status: UNCONFIRMED
Severity: minor
Priority: P2
Component: libstdc++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jkanze at cheuvreux dot com
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: sparc-sun-solaris2.8
GCC host triplet: sparc-sun-solaris2.8
GCC target triplet: sparc-sun-solaris2.8
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21334