Bug 21334 - Lack of Posix compliant thread safety in std::basic_string
Summary: Lack of Posix compliant thread safety in std::basic_string
Status: SUSPENDED
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 3.4.3
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
: 10350 32261 (view as bug list)
Depends on: 24882 33394
Blocks:
  Show dependency treegraph
 
Reported: 2005-05-02 11:45 UTC by James Kanze
Modified: 2015-03-23 13:01 UTC (History)
12 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2007-05-08 19:47:19


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description James Kanze 2005-05-02 11:45:19 UTC
I am sending this to the g++ bug list on the recommendation of
Gabriel Dos Reis.  From what little I've read in the g++
documentation, I'm not convinced that the authors of the g++
library intend for it to be supported, although Posix would seem
to require it.

For the record, the statement in Posix is: "Applications shall
ensure that access to any memory location by more than one
thread of control (threads or processes) is restricted such that
no thread of control can read or modify a memory location while
another thread of control may be modifying it."  The obvious
extension to C++ is that of replacing "memory location" with
"object"; at the very least, of course, one can only require
something of "memory locations" which the user sees, directly or
indirectly.  The statement in the libstdc++-v3 FAQ (question
5.6) is: "All library objects are safe to use in a multithreaded
program as long as each thread carefully locks out access by any
other thread while it uses any object visible to another thread,
i.e., treat library objects like any other shared resource. In
general, this requirement includes both read and write access to
objects; unless otherwise documented as safe, do not assume that
two threads may access a shared standard library object at the
same time."  A considerably weaker guarantee than what one
normally expects under Posix.  (Note that the clause "like any
other shared resource" is simply false for those of us used to
the Posix model.  If I replace std::string with char[] in my
code below, the behavior is perfectly defined under Posix.)

The following is an example of a program which may cause
problems:

    #include <pthread.h>
    #include <iostream>
    #include <ostream>
    #include <string>

    std::string         g ;

    extern "C" void*
    thread1(
        void*               lock )
    {
        std::string         t( g ) ;
        pthread_mutex_lock( static_cast< pthread_mutex_t* >( lock ) ) ;
        std::cout << t << '\n' ;
        pthread_mutex_unlock( static_cast< pthread_mutex_t* >( lock ) ) ;
        return NULL ;
    }

    extern "C" void*
    thread2(
        void*               lock )
    {
        std::string         t( g.begin(), g.end() ) ;
        pthread_mutex_lock( static_cast< pthread_mutex_t* >( lock ) ) ;
        std::cout << t << '\n' ;
        pthread_mutex_unlock( static_cast< pthread_mutex_t* >( lock ) ) ;
        return NULL ;
    }

    int
    main()
    {
        g = "0123456789" ;
        pthread_mutex_t     lock ;
        pthread_mutex_init( &lock, NULL ) ;
        pthread_t           t1 ;
        if ( pthread_create( &t1, NULL, &thread1, &lock ) != 0 ) {
            std::cerr << "Could not create thread1" << std::endl ;
        }
        pthread_t           t2 ;
        if ( pthread_create( &t2, NULL, &thread2, &lock ) != 0 ) {
            std::cerr << "Could not create thread    }
        pthread_join( t1, NULL ) ;
        pthread_join( t2, NULL ) ;
        return 0 ;
    }

Consider the following scenario:

 -- Thread 2 executes first.  It gets to the expression
    g.begin() (which for purposes of argument, we will suppose
    is called before g.end() -- the problem will occur in which
    ever function is called first), and starts executing it.

    At this point, the value _M_refcount in the _Rep_base is 0,
    since there is only one instance, g, which shares the
    representation.  The representation is not "leaked", so we
    call _M_leak_hard.

    _M_leak_hard calls _M_rep()->_M_is_shared(), which returns
    false.

 -- Thread 1 interupts.  Thread 2 calls the copy constructor,
    with g as a parameter, which ultimately calls _M_grab on the
    _M_is_leaked() returns false, since the _M_refcount is still
    0 in the representation.  Thread 2 thus calls _M_refcopy()
    on the representation, which (atomically) increments
    _M_refcount.  Thread 1 leaves the copy constructor.

 -- Now back to thread 2.   Since _M_is_shared() returned false,
    thread 2 doesn't call _M_mutate -- is simply calls
    _M_set_leaked() on the representation, which sets
    _M_refcount to -1.

    We will suppose that this modification is visible to all
    other threads, despite the fact that there are no memory
    barriers around it (which means that this supposition will
    be false on certain platforms).

 -- And life goes on.  The second call to begin()/end() doesn't
    change anything, because it finds that the representation is
    already "leaked".

 -- Finally, suppose that thread 1 finishes while thread 1 is
    still using its iterators.  Thread 1 calls the destructor
    for its string.  It sees that _M_refcount < 0, concludes
    that the representation is leaked, and deletes it.  Despite
    the fact that thread 2 still holds iterators refering to it,
    and despite the fact that there is still a global variable
    (usable after the pthread_joins in main) which depends on
    it.

The problem is, of course, that the sequence which tests whether
we have to leak, and then leaks, is not atomic.
Comment 1 Andrew Pinski 2005-05-02 11:54:58 UTC
Is this a duplicate of bug 10350?
Comment 2 Paolo Carlini 2005-05-02 12:02:03 UTC
Two quick comments: 1- I'd like to keep open either 10350 or this one, I don't
see much value in keeping open both. Ok? 2- I'm not aware of any real cure for
this kind of problems within a RC implementation. Are you? For what is worth,
in the v7-branch there is already a basic_string framework for flexible memory
management policies and I have already prototyped an additional policy not using
RC at all. I'm planning to add it the repository very soon. This is stuff that
breaks the 3.4/4.0 ABI of course, but if people want it we can envisage providing
it as an extension in future 4.0.x releases.
Comment 3 James Kanze 2005-05-02 13:22:48 UTC
Subject: Re:  Lack of Posix compliant thread safety in std::basic_string


Looks like it.  The example function at the user level isn't the
same, but the basic problem is.

I'd forgotten I ever sent the first one.  At some point, I read
somewhere that G++ didn't want to support multiple accesses
from different threads, even if there was no modification.
I'd mentionned this in various news groups, and Gabriel
Dos Reis took me to the task: he said that if there was no
modification in either thread, I should submit it as a bug.

I guess my real question is the same one with which the
previous bug ended: is this an error, or is this the intended
behavior. If it is an error in the code, then it needs correcting
not only in the code, but in the text which I cited from the library
FAQ.  (If it is the intended behavior, of course, you're going to
hear a lot of complaints from those of us used to the Posix
model:-).)

--
James

"This message, including any attachments may contain confidential and
privileged material; it is intended only for the person to whom it is
addressed. Its contents do not constitute a commitment by Credit Agricole
Cheuvreux except where provided for in a written agreement. Credit Agricole
Cheuvreux assumes no liability or responsibility for the consequences
arising out of a delay and/or loss in transit of this message, or for
corruption or other error(s) arising in its transmission and for any misuse
or fraudulent use which may be made thereof. If you are not the intended
recipient, please contact us and abstain from any disclosure, use or
dissemination. To the extent that this message contains research
information and/or recommendations, these are provided on the same basis as
Credit Agricole Cheuvreux's published research and the recipient must have
regard to all disclosures and disclaimers contained therein."




Comment 4 Gabriel Dos Reis 2005-05-02 13:27:31 UTC
Subject: Re:  Lack of Posix compliant thread safety in std::basic_string

"pcarlini at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:

| Two quick comments: 1- I'd like to keep open either 10350 or this
| one, I don't see much value in keeping open both.

I don't see much value in discarding either of them.  Both provide
complimentary information and analysis.  Whatever you do, do make sure
that none of them disappear.  I would have preference to keep this one
open because it raises another question regarding our documentation.

| Ok? 2- I'm not aware of any real cure for
| this kind of problems within a RC implementation. Are you?

Beside death to COW? ;->

| For what is worth,
| in the v7-branch there is already a basic_string framework for flexible memory
| management policies and I have already prototyped an additional policy not using
| RC at all. I'm planning to add it the repository very soon. This is stuff that
| breaks the 3.4/4.0 ABI of course, but if people want it we can envisage providing
| it as an extension in future 4.0.x releases.

-- Gaby
Comment 5 James Kanze 2005-05-02 13:30:10 UTC
Subject: Re:  Lack of Posix compliant thread safety in std::basic_string


|> Two quick comments: 1- I'd like to keep open either 10350 or
|> this one, I don't see much value in keeping open both. Ok?

In that case, I'd go for this one.  It mentions one of the
places where g++ documentation suggests that this shouldn't
work.  And it mentions the Posix standard as an argument as to
why it should.

|> 2- I'm not aware of any real cure for this kind of problems
|> within a RC implementation. Are you?

pthread_mutex_lock:-).  Of course, performance will take a
serious hit.

--
James Kanze                              mailto:james.kanze@free.fr
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

"This message, including any attachments may contain confidential and
privileged material; it is intended only for the person to whom it is
addressed. Its contents do not constitute a commitment by Credit Agricole
Cheuvreux except where provided for in a written agreement. Credit Agricole
Cheuvreux assumes no liability or responsibility for the consequences
arising out of a delay and/or loss in transit of this message, or for
corruption or other error(s) arising in its transmission and for any misuse
or fraudulent use which may be made thereof. If you are not the intended
recipient, please contact us and abstain from any disclosure, use or
dissemination. To the extent that this message contains research
information and/or recommendations, these are provided on the same basis as
Credit Agricole Cheuvreux's published research and the recipient must have
regard to all disclosures and disclaimers contained therein."




Comment 6 Paolo Carlini 2005-05-02 13:31:43 UTC
Ok, thanks, let's keep open this one, then.
Comment 7 Paolo Carlini 2005-05-02 13:32:30 UTC
*** Bug 10350 has been marked as a duplicate of this bug. ***
Comment 8 Gabriel Dos Reis 2005-05-02 14:09:41 UTC
Subject: Re:  Lack of Posix compliant thread safety in std::basic_string

"pcarlini at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:

| ------- Additional Comments From pcarlini at suse dot de  2005-05-02 13:31 -------
| Ok, thanks, let's keep open this one, then.
| 
| -- 
|            What    |Removed                     |Added
| ----------------------------------------------------------------------------
|            Severity|minor                       |enhancement

Isn't this a bug as opposed to "enhancement"?  Enhancement suggests
that the behaviour is basically correct, but could be improved. 

-- Gaby
Comment 9 Paolo Carlini 2005-05-02 14:20:04 UTC
Does the C++ standard mention multithreading and Posix threads? ;)
Comment 10 Gabriel Dos Reis 2005-05-02 14:42:01 UTC
(In reply to comment #9)
> Does the C++ standard mention multithreading and Posix threads? ;)

That is a very uninteresting question.  You're quite right that the
C++ standard does not mention usability, though.
Comment 11 rittle@latour.waar.labs.mot.com 2005-05-02 16:02:14 UTC
Subject: Re:  New: Lack of Posix compliant thread safety in
	std::basic_string

>I am sending this to the g++ bug list on the recommendation of
>Gabriel Dos Reis.  From what little I've read in the g++
>documentation, I'm not convinced that the authors of the g++
>library intend for it to be supported, although Posix would seem
>to require it.

Firstly, POSIX says nothing about C++ thus my confusion starts here.
Secondly, it is clear that your bug report is hypothetical.  The
library maintainers do not typically deal in hypotheticals.

>For the record, the statement in Posix is: "Applications shall
>ensure that access to any memory location by more than one
>thread of control (threads or processes) is restricted such that
>no thread of control can read or modify a memory location while
>another thread of control may be modifying it."  The obvious
>extension to C++ is that of replacing "memory location" with
>"object"; at the very least, of course, one can only require
>something of "memory locations" which the user sees, directly or
>indirectly.

OK.

>The statement in the libstdc++-v3 FAQ (question
>5.6) is: "All library objects are safe to use in a multithreaded
>program as long as each thread carefully locks out access by any
>other thread while it uses any object visible to another thread,
>i.e., treat library objects like any other shared resource. In
>general, this requirement includes both read and write access to
>objects; unless otherwise documented as safe, do not assume that
>two threads may access a shared standard library object at the
>same time."

OK, I seem to recall editing that statement at one point...

>A considerably weaker guarantee than what one
>normally expects under Posix.  (Note that the clause "like any
>other shared resource" is simply false for those of us used to
>the Posix model.  If I replace std::string with char[] in my
>code below, the behavior is perfectly defined under Posix.)

No, confusion.  We are guaranteeing that if you lock the visible
accesses to global or otherwise shared objects, that no POSIX
threading rule will be violated within our own library code.  We are
saying that we guarantee that any internal shared objects (which may
not be visible to the user, thus not lockable by the user) will be
correctly handled per POSIX threading rules.

>The following is an example of a program which may cause
>problems:
>
>    #include <pthread.h>
>    #include <iostream>
>    #include <ostream>
>    #include <string>
>
>    std::string         g ;

Please note that g is a global object which you have shared between
two tasks.  Thus the words in the FAQ are operative unless another
guarantee is documented as operative.

>    extern "C" void*
>    thread1(
>        void*               lock )
>    {
>        std::string         t( g ) ;

This is read access of shared, global g thus incorrect usage according
to the general contract that we offered you.

>        pthread_mutex_lock( static_cast< pthread_mutex_t* >( lock ) ) ;
>        std::cout << t << '\n' ;
>        pthread_mutex_unlock( static_cast< pthread_mutex_t* >( lock ) ) ;
>        return NULL ;
>    }
>
>    extern "C" void*
>    thread2(
>        void*               lock )
>    {
>        std::string         t( g.begin(), g.end() ) ;

This is read access of shared, global g thus this is incorrect usage
according to the general contract that we offered you.

There may or may not be additional documentation covering the
std::string<> implementation which implies that the above usage should
work.  In fact, I think there is such documentation.  However, you
should refer to that documentation not the general FAQ about all other
library objects.

>        pthread_mutex_lock( static_cast< pthread_mutex_t* >( lock ) ) ;
>        std::cout << t << '\n' ;
>        pthread_mutex_unlock( static_cast< pthread_mutex_t* >( lock ) ) ;

I note that you used the correct idiom implied by the FAW to lock
shared access to the gloabl std::cout.  Why do you consider g to be
different?

>        return NULL ;
>    }
>
>    int
>    main()
>    {
>        g = "0123456789" ;
>        pthread_mutex_t     lock ;
>        pthread_mutex_init( &lock, NULL ) ;
>        pthread_t           t1 ;
>        if ( pthread_create( &t1, NULL, &thread1, &lock ) != 0 ) {
>            std::cerr << "Could not create thread1" << std::endl ;
>        }
>        pthread_t           t2 ;
>        if ( pthread_create( &t2, NULL, &thread2, &lock ) != 0 ) {
>            std::cerr << "Could not create thread    }
>        pthread_join( t1, NULL ) ;
>        pthread_join( t2, NULL ) ;
>        return 0 ;
>    }
>
>Consider the following scenario:
>
> -- Thread 2 executes first.  It gets to the expression
>    g.begin() (which for purposes of argument, we will suppose
>    is called before g.end() -- the problem will occur in which
>    ever function is called first), and starts executing it.
>
>    At this point, the value _M_refcount in the _Rep_base is 0,
>    since there is only one instance, g, which shares the
>    representation.  The representation is not "leaked", so we
>    call _M_leak_hard.
>
>    _M_leak_hard calls _M_rep()->_M_is_shared(), which returns
>    false.
>
> -- Thread 1 interupts.  Thread 2 calls the copy constructor,
>    with g as a parameter, which ultimately calls _M_grab on the
>    _M_is_leaked() returns false, since the _M_refcount is still
>    0 in the representation.  Thread 2 thus calls _M_refcopy()
>    on the representation, which (atomically) increments
>    _M_refcount.  Thread 1 leaves the copy constructor.
>
> -- Now back to thread 2.   Since _M_is_shared() returned false,
>    thread 2 doesn't call _M_mutate -- is simply calls
>    _M_set_leaked() on the representation, which sets
>    _M_refcount to -1.
>
>    We will suppose that this modification is visible to all
>    other threads, despite the fact that there are no memory
>    barriers around it (which means that this supposition will
>    be false on certain platforms).
>
> -- And life goes on.  The second call to begin()/end() doesn't
>    change anything, because it finds that the representation is
>    already "leaked".
>
> -- Finally, suppose that thread 1 finishes while thread 1 is
>    still using its iterators.  Thread 1 calls the destructor
>    for its string.  It sees that _M_refcount < 0, concludes
>    that the representation is leaked, and deletes it.  Despite
>    the fact that thread 2 still holds iterators refering to it,
>    and despite the fact that there is still a global variable
>    (usable after the pthread_joins in main) which depends on
>    it.
>
>The problem is, of course, that the sequence which tests whether
>we have to leak, and then leaks, is not atomic.
>-- 
>           Summary: Lack of Posix compliant thread safety in
>                    std::basic_string
>           Product: gcc
>           Version: 3.4.3
>            Status: UNCONFIRMED
>          Severity: minor
>          Priority: P2
>         Component: libstdc++
>        AssignedTo: unassigned at gcc dot gnu dot org
>        ReportedBy: jkanze at cheuvreux dot com
>                CC: gcc-bugs at gcc dot gnu dot org
> GCC build triplet: sparc-sun-solaris2.8
>  GCC host triplet: sparc-sun-solaris2.8
>GCC target triplet: sparc-sun-solaris2.8
>
>
>http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21334
Comment 12 Gabriel Dos Reis 2005-05-02 16:40:17 UTC
Subject: Re:  Lack of Posix compliant thread safety in std::basic_string

"rittle at latour dot waar dot labs dot mot dot com" <gcc-bugzilla@gcc.gnu.org> writes:

| Subject: Re:  New: Lack of Posix compliant thread safety in
| 	std::basic_string
| 
| >I am sending this to the g++ bug list on the recommendation of
| >Gabriel Dos Reis.  From what little I've read in the g++
| >documentation, I'm not convinced that the authors of the g++
| >library intend for it to be supported, although Posix would seem
| >to require it.
| 
| Firstly, POSIX says nothing about C++

That statement is correct.  However, we have also taken step to tell
users to specify the threading model the compiler, therefore the
library, is to be built with.  Therefore, we have created
expecttations.  We can't reasonably go and tell users that C++
standard does not say anything about threading, so we would be right
to do whatever we like.  
Either we decide to ignore threading completely, or we take the more
useful apporach (as we've done so far) and take reports seriously.
You have a long standing record to take these issues seriously and I'm
glad you're weighting in.

| Secondly, it is clear that your bug report is hypothetical.  The
| library maintainers do not typically deal in hypotheticals.

thread-safety issues are hard to debug/reproduce, so some amounts of 
reasoning "in abstract" is necessary -- I would not go far as calling
it hypotheticals.  What I told James was that it would very much
appreciated if he could give a testcase that reproduces or is likely
to reproduce the problem -- acknowledging the fact that it may be hard
to get all the circumstances right simultaneously in order to trigger
the problem.

You probably got the mail about users saying that he read on C++
newsgroups that V3-s implementation of std::string is known to be
avoided in multi-threaded programs.  When I read similar claims from
James, I could not help but suggest that he make a PR if he found that
to be true.  Which is how we're here.

-- Gaby
Comment 13 Paolo Carlini 2005-05-02 17:01:32 UTC
> You probably got the mail about users saying that he read on C++
> newsgroups that V3-s implementation of std::string is known to be
> avoided in multi-threaded programs.

Whereas I'm all for providing alternate memory management policies (we are
very close to that in the v7-branch and I promise further progress during
the next months) I fail to properly appreciate the actual details of this issue:
which applications are actually penalized, which are the alternatives, something
more concrete, in other terms, that a vague reference to those "position" papers
that we all know, in other terms ;) (and one of those I'd like to discuss
in detail...) Maybe James can help here. In practice, we could, for instance,
*within the current ABI*, provide a switch to disable completely reference
counting, would that be appreciated? I can implement this very quickly but I'd
like to have some evidence that a non-trivial number of users would appreciate
that short-term solution. That would be indeed, a short term solution, because
a library relying purely on a non-refcounted string has subtle issues with
memory allocation during exceptions, that definitely we don't want in a long 
term solution: if nobody has got a better idea, I'm afraid we have to implement
also a separate ref-counted mini-string for usage in exceptions.
Comment 14 Gabriel Dos Reis 2005-05-02 17:17:28 UTC
Subject: Re:  Lack of Posix compliant thread safety in std::basic_string

"pcarlini at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:

| ------- Additional Comments From pcarlini at suse dot de  2005-05-02 17:01 -------
| > You probably got the mail about users saying that he read on C++
| > newsgroups that V3-s implementation of std::string is known to be
| > avoided in multi-threaded programs.
| 
| Whereas I'm all for providing alternate memory management policies (we are

I don't quite understand why you want to see this issue as a memory 
management policy issue, as opposed to a thread safety issue.

| very close to that in the v7-branch and I promise further progress during
| the next months) I fail to properly appreciate the actual details of this issue:
| which applications are actually penalized, which are the alternatives, something
| more concrete, in other terms, that a vague reference to those "position" papers
| that we all know, in other terms ;) (and one of those I'd like to discuss
| in detail...) Maybe James can help here. In practice, we could, for instance,

I don't quite understand what "position papers" you're talking about.
Could you give references to them to clarify the issue?

| *within the current ABI*, provide a switch to disable completely reference
| counting, would that be appreciated? I can implement this very quickly but I'd
| like to have some evidence that a non-trivial number of users would appreciate
| that short-term solution. That would be indeed, a short term solution, because
| a library relying purely on a non-refcounted string has subtle issues with
| memory allocation during exceptions, that definitely we don't want in a long 
| term solution: if nobody has got a better idea, I'm afraid we have to implement
| also a separate ref-counted mini-string for usage in exceptions.

What are those subtle issues with memory management during exceptions?
As far as I can tell, V3 is the only one that insists on COW.  Why is
it that others don't have those "subtle issues"?

-- Gaby
Comment 15 Paolo Carlini 2005-05-02 17:23:18 UTC
> I don't quite understand why you want to see this issue as a memory 
> management policy issue, as opposed to a thread safety issue.

Whatever, after all that is the layer at which you actually do the low-level
memory management. That is the level at which you decide whether actually
freeing or allocating memory, no? But I don't care much about those names.

> I don't quite understand what "position papers" you're talking about.
> Could you give references to them to clarify the issue?

Gaby, of course Herb Sutter papers, also reprinted in his book.

> What are those subtle issues with memory management during exceptions?
> As far as I can tell, V3 is the only one that insists on COW.  Why is
> it that others don't have those "subtle issues"?

Because the best one, definitely have a separate mini-string of sort,
what do you think?
Comment 16 Paolo Carlini 2005-05-02 17:33:01 UTC
About the last issue, besided confidential information, there is this nice
public thread that nicely summarizes it:

  http://tinyurl.com/9o4oj
Comment 17 Gabriel Dos Reis 2005-05-02 18:45:00 UTC
Subject: Re:  Lack of Posix compliant thread safety in std::basic_string

"pcarlini at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:

| ------- Additional Comments From pcarlini at suse dot de  2005-05-02 17:23 -------
| > I don't quite understand why you want to see this issue as a memory 
| > management policy issue, as opposed to a thread safety issue.
| 
| Whatever, after all that is the layer at which you actually do the low-level
| memory management. That is the level at which you decide whether actually
| freeing or allocating memory, no?

No.  You can have a thread safe or thread unsafe management.  Doing
things at low level should not imply being blind-sighted or losing
perspective. 

| But I don't care much about those names.
| 
| > I don't quite understand what "position papers" you're talking about.
| > Could you give references to them to clarify the issue?
| 
| Gaby, of course Herb Sutter papers, also reprinted in his book.

Ah, but if you said "Herb's paper", that would have rung a
bell. Saying "position paper" does not tell me much, except that he
who uses that term was being a bit condencending.

No, I'm not discussing a position paper.  I'm trying to make sure
that we do not hand wave PRs under the assumption that they would come
from "position papers". 

| > What are those subtle issues with memory management during exceptions?
| > As far as I can tell, V3 is the only one that insists on COW.  Why is
| > it that others don't have those "subtle issues"?
| 
| Because the best one, definitely have a separate mini-string of sort,
| what do you think?

You mean the trick used in VC implementation for example?

-- Gaby
Comment 18 James Kanze 2005-05-03 08:34:24 UTC
Subject: Re:  Lack of Posix compliant thread safety in std::basic_string


|> Isn't this a bug as opposed to "enhancement"?  Enhancement
|> suggests that the behaviour is basically correct, but could be
|> improved.

I could accept that.  The behavior *is* conforme with your
documentation, even if it isn't conforme with what Posix
normally requires.

--
James Kanze

"This message, including any attachments may contain confidential and
privileged material; it is intended only for the person to whom it is
addressed. Its contents do not constitute a commitment by Credit Agricole
Cheuvreux except where provided for in a written agreement. Credit Agricole
Cheuvreux assumes no liability or responsibility for the consequences
arising out of a delay and/or loss in transit of this message, or for
corruption or other error(s) arising in its transmission and for any misuse
or fraudulent use which may be made thereof. If you are not the intended
recipient, please contact us and abstain from any disclosure, use or
dissemination. To the extent that this message contains research
information and/or recommendations, these are provided on the same basis as
Credit Agricole Cheuvreux's published research and the recipient must have
regard to all disclosures and disclaimers contained therein."




Comment 19 James Kanze 2005-05-03 08:37:40 UTC
Subject: Re:  Lack of Posix compliant thread safety in std::basic_string


|> Does the C++ standard mention multithreading and Posix
|> threads? ;)

No, but the g++ installation procedures do.  According to the
installation procedured, with the options I gave on generation,
the compiler supports the Posix thread model.

Or are you suggesting that g++ should return to the pre-3.0
position that multithreaded programs don't exist, or aren't
interesting to g++ users.

--
James Kanze

"This message, including any attachments may contain confidential and
privileged material; it is intended only for the person to whom it is
addressed. Its contents do not constitute a commitment by Credit Agricole
Cheuvreux except where provided for in a written agreement. Credit Agricole
Cheuvreux assumes no liability or responsibility for the consequences
arising out of a delay and/or loss in transit of this message, or for
corruption or other error(s) arising in its transmission and for any misuse
or fraudulent use which may be made thereof. If you are not the intended
recipient, please contact us and abstain from any disclosure, use or
dissemination. To the extent that this message contains research
information and/or recommendations, these are provided on the same basis as
Credit Agricole Cheuvreux's published research and the recipient must have
regard to all disclosures and disclaimers contained therein."




Comment 20 James Kanze 2005-05-03 08:56:09 UTC
Subject: Re:  Lack of Posix compliant thread safety in std::basic_string


|> >I am sending this to the g++ bug list on the recommendation of
|> >Gabriel Dos Reis.  From what little I've read in the g++
|> >documentation, I'm not convinced that the authors of the g++
|> >library intend for it to be supported, although Posix would seem
|> >to require it.

|> Firstly, POSIX says nothing about C++ thus my confusion starts here.

The statement I quoted from the Posix standard is language
neutral.  A priori, it applies to all languages.

|> Secondly, it is clear that your bug report is hypothetical.  The
|> library maintainers do not typically deal in hypotheticals.

I guess it is really a question of whether you want quality or
not.  In general, all code is supposed incorrect until proven
otherwise -- in the case of threading issues, this is
particularly important, because of the random factors involved.

In this particular case, there is only a very, very small window
of time in which the error can occur.  I could add a lot of
extra code, which would only obfuscate the real problem, but
would increase the chances that the error occurs.  I could
probably, experimentally, add just the right amount so that the
error occurred more or less regularly (say one execution in
ten) on my machine.  Unless your machine had exactly the same
configuration (processor, clock speed, background processes,
etc.), the error would probably not trigger there.

|> >For the record, the statement in Posix is: "Applications shall
|> >ensure that access to any memory location by more than one
|> >thread of control (threads or processes) is restricted such that
|> >no thread of control can read or modify a memory location while
|> >another thread of control may be modifying it."  The obvious
|> >extension to C++ is that of replacing "memory location" with
|> >"object"; at the very least, of course, one can only require
|> >something of "memory locations" which the user sees, directly or
|> >indirectly.

|> OK.

|> >The statement in the libstdc++-v3 FAQ (question
|> >5.6) is: "All library objects are safe to use in a multithreaded
|> >program as long as each thread carefully locks out access by any
|> >other thread while it uses any object visible to another thread,
|> >i.e., treat library objects like any other shared resource. In
|> >general, this requirement includes both read and write access to
|> >objects; unless otherwise documented as safe, do not assume that
|> >two threads may access a shared standard library object at the
|> >same time."

|> OK, I seem to recall editing that statement at one point...

|> >A considerably weaker guarantee than what one
|> >normally expects under Posix.  (Note that the clause "like any
|> >other shared resource" is simply false for those of us used to
|> >the Posix model.  If I replace std::string with char[] in my
|> >code below, the behavior is perfectly defined under Posix.)

|> No, confusion.  We are guaranteeing that if you lock the visible
|> accesses to global or otherwise shared objects, that no POSIX
|> threading rule will be violated within our own library code.

Well, there's certainly some confusion, because Posix says one
thing, and your documentation says another.  When you say "treat
library objects like any other shared resources', you are
contradicting yourself for a system using the Posix model.

In the case of string, of course, the obvious "other' system
resource to compare it with is char[].  If you replace
std::string with char[], in my example program, the code is well
defined and guaranteed to work under Posix.

|> We are
|> saying that we guarantee that any internal shared objects (which may
|> not be visible to the user, thus not lockable by the user) will be
|> correctly handled per POSIX threading rules.

I know what you are guaranteeing (in the statement above).  It's
because of this statement (or a similar one elsewhere in the
documentation) that I hadn't sent in a bug report, but that I
did take the bother of warning people in various news groups
that in g++, std::string did not behave as one would normally
expect.  You can use it, but you have to take additional
precautions that Posix says shouldn't be necessary, and that
experienced Posix users don't expect to need.

An implementation, of course, is free to decide what it wants to
guarantee, and what it doesn't.  If it is decided, however, that
the Posix guarantees do not extend to the library, then it is
important to document this fact; i.e. to indicate somewhere that
it cannot be missed that configuring the compiler to support
pthreads does NOT mean what it seems to mean.

--
James Kanze

"This message, including any attachments may contain confidential and
privileged material; it is intended only for the person to whom it is
addressed. Its contents do not constitute a commitment by Credit Agricole
Cheuvreux except where provided for in a written agreement. Credit Agricole
Cheuvreux assumes no liability or responsibility for the consequences
arising out of a delay and/or loss in transit of this message, or for
corruption or other error(s) arising in its transmission and for any misuse
or fraudulent use which may be made thereof. If you are not the intended
recipient, please contact us and abstain from any disclosure, use or
dissemination. To the extent that this message contains research
information and/or recommendations, these are provided on the same basis as
Credit Agricole Cheuvreux's published research and the recipient must have
regard to all disclosures and disclaimers contained therein."




Comment 21 James Kanze 2005-05-03 09:09:01 UTC
Subject: Re:  Lack of Posix compliant thread safety in std::basic_string


|> Whereas I'm all for providing alternate memory management
|> policies (we are very close to that in the v7-branch and I
|> promise further progress during the next months) I fail to
|> properly appreciate the actual details of this issue: which
|> applications are actually penalized, which are the alternatives,
|> something more concrete, in other terms, that a vague reference
|> to those "position" papers that we all know, in other terms ;)
|> (and one of those I'd like to discuss in detail...) Maybe James
|> can help here.

I'm not sure what sort of help you are looking for.  I thought
that I very clearly pointed out the problem, and the point in
the code where it occured.

In typicaly code, the problem occurs mainly with configuration
variables; they are declared non-const, because they have to be
set on start up, either from the command line, or from a
configuration file, and then used as if they were const.
According to Posix, I should be able to access such variables
without additional, user defined synchronization.

Note that the problem can be very, very subtle, and that most of
the time, the program will work despite the problem -- this is
one of the most pernicious cases of undefined behavior.

|> In practice, we could, for instance, *within the
|> current ABI*, provide a switch to disable completely reference
|> counting, would that be appreciated?

It would help.  But an implementation designed with reference
counting in mind is likely to be rather slow when reference
counting is turned off.  IMHO, correct and slow is better than
the current situation, but not all compiler users agree.

|> I can implement this very
|> quickly but I'd like to have some evidence that a non-trivial
|> number of users would appreciate that short-term solution.  That
|> would be indeed, a short term solution, because a library
|> relying purely on a non-refcounted string has subtle issues with
|> memory allocation during exceptions, that definitely we don't
|> want in a long term solution: if nobody has got a better idea,
|> I'm afraid we have to implement also a separate ref-counted
|> mini-string for usage in exceptions.

Most other implementations I'm aware of don't use std::string in
exceptions.  That would seem to be the best solution.

Other than that, there is no perfect solution with regards to
exceptions.  If an exception requires resources, and they aren't
there, there's going to be a problem.  What happens if you can't
allocate the memory for the exception itself?

--
James Kanze

"This message, including any attachments may contain confidential and
privileged material; it is intended only for the person to whom it is
addressed. Its contents do not constitute a commitment by Credit Agricole
Cheuvreux except where provided for in a written agreement. Credit Agricole
Cheuvreux assumes no liability or responsibility for the consequences
arising out of a delay and/or loss in transit of this message, or for
corruption or other error(s) arising in its transmission and for any misuse
or fraudulent use which may be made thereof. If you are not the intended
recipient, please contact us and abstain from any disclosure, use or
dissemination. To the extent that this message contains research
information and/or recommendations, these are provided on the same basis as
Credit Agricole Cheuvreux's published research and the recipient must have
regard to all disclosures and disclaimers contained therein."




Comment 22 Paolo Carlini 2005-05-03 09:29:16 UTC
> I'm not sure what sort of help you are looking for.  I thought
> that I very clearly pointed out the problem, and the point in
> the code where it occured.

Ok, my message was not clear. I'm looking for help about the
*performance* issue, not the correctness issue. To your best
knowledge all those users that avoid v3 basic_string for MT
applications do that for performance or for correctness (wrt
the Posix issue that you are pointing out in detail)?? Reading
those papers that I mentioned before (+ another one on C/C++
Users Journal which exactly touches *your* issue) it's *not*
at all clear that the latter is the main issue, in the general
understanding.

> It would help.  But an implementation designed with reference
> counting in mind is likely to be rather slow when reference
> counting is turned off.  IMHO, correct and slow is better than
> the current situation, but not all compiler users agree.

Indeed. Of course I'm talking about a *switch* (off by default)
But, about the correctness point...

> Most other implementations I'm aware of don't use std::string in
> exceptions.  That would seem to be the best solution.

Ok, but I don't think we can also change that in the short term
and not affecting the ABI, we are definitely going to do that in
v7-branch and I would appreciate if you could advise about the
best solution among all those proposed (see the thread).

Returning to our issue, however, do you still believe that a
switch turning off RC would be useful if we don't change the
treatment of exceptions? I don't know, maybe we can have 
something simple for that within the present ABI, but I cannot
make promises and want to have an idea of the amount of work.
Comment 23 Paolo Carlini 2005-05-03 10:39:12 UTC
In exceptions, I'm tempted to use something very simple, a fixed-size buffer,
as in STLPort, but that is the typical change affecting the ABI :(
Comment 24 James Kanze 2005-05-03 10:59:17 UTC
Subject: Re:  Lack of Posix compliant thread safety in std::basic_string


|> > I'm not sure what sort of help you are looking for.  I thought
|> > that I very clearly pointed out the problem, and the point in
|> > the code where it occured.

|> Ok, my message was not clear. I'm looking for help about the
|> *performance* issue, not the correctness issue. To your best
|> knowledge all those users that avoid v3 basic_string for MT
|> applications do that for performance or for correctness (wrt
|> the Posix issue that you are pointing out in detail)??

To the best of my knowledge, not very many people avoid
basic_string.  Most people aren't that aware of threading issues
to begin with, and most people just use whatever is there, and
suppose that it meets whatever standards are usual for
thread-safety on the platform in question.  I only started being
more wary myself because on two occasions, I've had to "fix"
programs which didn't work, because they used the basic_string
which came with 2.95.2 -- it just didn't occur to the authors to
ask the question whether the library gave the Posix guarantees.

The situation with the current library is simple: there is a
specific sequence of events which will cause it to use deleted
memory, double delete, etc.  For that to occur, one thread must
access the same object via [], at(), begin() or end(), a second
thread must copy the object, and the second thread must
interrupt the first thread between the test whether the object
is shared, and the moment it is marked as leaked.  On a typical
machine, that last condition will only last a couple of machine
instructions, less than a microsecond.  Which means that most of
the time, even if the code is incorrect, it will seem to work.
And every once in a blue moon, the user will get an unexplicable
crash, that the developers cannot duplicate.

The problem can be avoided.  The easiest (and surest) way is by
synchronizing manually; only accessing the global objects to
copy them (using the copy constructor) should work as well.  (As
soon as the representation is shared, the code works.  At least
in practice on single processor machines; there are no memory
barriers around the non-modifying reads, nor around the write of
-1 to mark the representation as leaked, which means that some
updates may not be correctly seen by threads running on a
different processor.  Only using copies still works, however;
the thread which makes the copy will at least see its update,
which results in a reference count greater than 0, which is all
that is needed to trigger the deep copy before leaking.  Still,
it's pretty shaky, and I would have expected anyone familiar
with thread safety issues to have ensured that the normal memory
barriers were present.)

|> Reading
|> those papers that I mentioned before (+ another one on C/C++
|> Users Journal which exactly touches *your* issue) it's *not*
|> at all clear that the latter is the main issue, in the general
|> understanding.

I'm not too sure which papers you are referring to, even after
Herb Sutter's name was mentionned.  I do know that the one
article I have read by Herb which concerned thread safety was
full of errors, and resulted in a long discussion in
comp.lang.c++.moderated.

|> > It would help.  But an implementation designed with reference
|> > counting in mind is likely to be rather slow when reference
|> > counting is turned off.  IMHO, correct and slow is better than
|> > the current situation, but not all compiler users agree.

|> Indeed. Of course I'm talking about a *switch* (off by default)
|> But, about the correctness point...

|> > Most other implementations I'm aware of don't use std::string in
|> > exceptions.  That would seem to be the best solution.

|> Ok, but I don't think we can also change that in the short term
|> and not affecting the ABI, we are definitely going to do that in
|> v7-branch and I would appreciate if you could advise about the
|> best solution among all those proposed (see the thread).

|> Returning to our issue, however, do you still believe that a
|> switch turning off RC would be useful if we don't change the
|> treatment of exceptions? I don't know, maybe we can have
|> something simple for that within the present ABI, but I cannot
|> make promises and want to have an idea of the amount of work.

I'm not sure.  As I said, the biggest part of the problem is
that people aren't aware of the problem.  When it comes down to
it, it's not nice, the the necessary work-arounds exist.  If
there is a new implementation in the pipeline which doesn't have
the problem, then I'd probably settle for a couple of prominent
warnings, in bold face, in the documentation.  Even though users
who actually even open the documentation seem in a minority, I'd
say that the most important thing is to realize what the
situation is, and document it.  Thus, for example, in the
statement from your FAQ that I quoted, you don't say "like any
other resource", you say "unlike other resources, locking is
necessary even if no thread modifies", and you use something
(bold print, etc.) to draw attention to the fact that the
guarantees are less that what one normally expects.

Not that that will necessarily help; most of the users I know
don't even know that a documentation for g++ exists.  It's g++
--help to find out the flags, and basta (if that much).  But
there are limits to what you have to do to protect the users
from themselves.

--
James Kanze

"This message, including any attachments may contain confidential and
privileged material; it is intended only for the person to whom it is
addressed. Its contents do not constitute a commitment by Credit Agricole
Cheuvreux except where provided for in a written agreement. Credit Agricole
Cheuvreux assumes no liability or responsibility for the consequences
arising out of a delay and/or loss in transit of this message, or for
corruption or other error(s) arising in its transmission and for any misuse
or fraudulent use which may be made thereof. If you are not the intended
recipient, please contact us and abstain from any disclosure, use or
dissemination. To the extent that this message contains research
information and/or recommendations, these are provided on the same basis as
Credit Agricole Cheuvreux's published research and the recipient must have
regard to all disclosures and disclaimers contained therein."




Comment 25 Paolo Carlini 2005-05-03 11:14:53 UTC
Hi James, and thanks for your explanations. Indeed, maybe better concentrating
on the v7-branch + documentation updates: as I told you already the framework
is already there and I will add very soon a different policy for not-refcounted
string + "short string optimization". Remains open the exceptions issue but
after all, now that I studied some materil, seems manageable.

But I cannot really reply in detail to your last message, now, just some 
pointers:

> I'm not too sure which papers you are referring to, even after
> Herb Sutter's name was mentionned.

Simply the series reprinted in "More exceptional C++", in particular
appendixes A and B.

                                          I do know that the one
> article I have read by Herb which concerned thread safety was
> full of errors, and resulted in a long discussion in
> comp.lang.c++.moderated.

Ah, yes, thanks. I read part of that thread, but have to re-read
it: in any case, will be not terribly useful for the work ahead.
Comment 26 Andrew Pinski 2005-05-03 15:06:58 UTC
(In reply to comment #24)
> "This message, including any attachments may contain confidential and
> privileged material; it is intended only for the person to whom it is
...

Can you stop attaching this message to the email messages since it is wrong and not really valid for any 
open mailing list?
Comment 27 James Kanze 2005-05-03 15:57:02 UTC
Subject: Re:  Lack of Posix compliant thread safety in std::basic_string


|> > "This message, including any attachments may contain confidential and
|> > privileged material; it is intended only for the person to whom it is
|> ...

|> Can you stop attaching this message to the email messages since
|> it is wrong and not really valid for any open mailing list?

Regretfully no.  For reasons beyond my fathoming, we have to use
Lotus Notes on a Windows machine for all external email, and
they've set up the Notes server to add this trailer (which as
you correctly point out, doesn't make much sense in a lot of
contexts.)  It's particularly painful, because there is no
development environment on the Windows machine, so I have to ftp
any code samples to and from the Solaris boxes.

Next time, I'll wait until I'm at home to report an error; my
Linux box doesn't have any of these problems:-).  (To be fair,
Windows doesn't have to be this bad, either; I've worked in
places where it was actually usable.)

--
James Kanze


"This message, including any attachments may contain confidential and
privileged material; it is intended only for the person to whom it is
addressed. Its contents do not constitute a commitment by Credit Agricole
Cheuvreux except where provided for in a written agreement. Credit Agricole
Cheuvreux assumes no liability or responsibility for the consequences
arising out of a delay and/or loss in transit of this message, or for
corruption or other error(s) arising in its transmission and for any misuse
or fraudulent use which may be made thereof. If you are not the intended
recipient, please contact us and abstain from any disclosure, use or
dissemination. To the extent that this message contains research
information and/or recommendations, these are provided on the same basis as
Credit Agricole Cheuvreux's published research and the recipient must have
regard to all disclosures and disclaimers contained therein."




Comment 28 rittle@latour.waar.labs.mot.com 2005-05-03 19:08:33 UTC
Subject: Re:  Lack of Posix compliant thread safety in
	std::basic_string

>|> Secondly, it is clear that your bug report is hypothetical.  The
>|> library maintainers do not typically deal in hypotheticals.

> I guess it is really a question of whether you want quality or
> not.  In general, all code is supposed incorrect until proven
> otherwise -- in the case of threading issues, this is
> particularly important, because of the random factors involved.

Of course, we want quality (and we want performance) which is why we
insist upon non-hypothetical bug reports (either way, I've already
explained what is wrong with your posted code w.r.t. the cited FAQ
entry) and why we insist that library users must write correct locking
sequences outside the context of the library.  As you well know, there
is nothing in POSIX which dictates that a library internalize the
locking vs. implying that the user code must hold a mutex lock.

You observe that we should make string<> act exactly like char[] from
the user's perspective.  Neither the cited FAQ (which I just reviewed
in detail, esp. the full text of section 5.6) nor POSIX modified as
you'd wish implies that at all.  char[] is a raw chunk of memory.  In
the cited FAQ, we explain that any method call on any shared object
must be mutex-protected unless we specially document the method or the
entire class.  We are stating that we do nothing inside the library
which would make it impossible for a user to write threaded
application with our library.  As I've already observed to you,
string<> may or may not be specially documented.  If it is, then you
need to cite that documentation in your bug report, not the general
FAQ section 5.6.  This is basically my point.  The view expressed in
the FAQ section 5.6 is entirely consistent with the POSIX view (as
conveyed to me by Butenhof's book).  POSIX doesn't care what portions
of the C or C++ application have to apply mutex locks to ensure it happens.

The prime issue is visibility of the object.  We take responsibility
for locking access to internally hidden shared objects whenever the
holding reference was not itself visibility shared in the application
code.  When the holding reference was visibility shared in the
application code, then the user is responsible for locking access.  In
your posted code, the user has direct visibility to the shared object
thus he must hold a lock under our general model (also the SGI STL
general model).  If the shared object was invisible to the user
(i.e. a shared cache not implied by the standard which is indirectly
accessed by two independently declared library objects) and we failed
to hold a mutex internal to the library, then I would be more
interested in your report...

Our general recommendations precisely covers this case.  I will go
farther: Any locking model under C++ which fails to account for object
model visibility will likely perform like a real dog.  Bow Wow.

BTW, I noticed that you had no response to my point that you are using
an incorrect locking idiom in your posted code w.r.t. our general FAQ.

If you would like to object that your posted code fails w.r.t. another
section of the FAQ which offers special guarantees, then I will likely
not complain if you close this PR and open a new one.

> Well, there's certainly some confusion, because Posix says one
> thing, and your documentation says another.

Huh?  POSIX dictates when code must do or not do something to ensure
that the threading model is sound.  We dictate when user code must do
something in line with the POSIX model.  We tell the user precisely
when they must account for locks when working with library objects.
Is it complex (in C++)?  Yes.

> When you say "treat library objects like any other shared
> resources', you are contradicting yourself for a system using the
> Posix model.

There is no contradiction.  We made it clear what you have to do when
you use the library to ensure that your entire program is thread-safe.
If you insist on cutting a corner, what can we do?

If you think other words would be better in our FAQ, then I will take
no offense as the last author of that section if you propose a change.

> [...] You can use it, but you have to take additional
> precautions that Posix says shouldn't be necessary, and that
> experienced Posix users don't expect to need.

Humm...  If you would like to fix the library to match your
expectations instead of crafting application-level code as we (and
SGI) suggest, then please be our guest.  Your patch will only be
accepted if you don't kill library performance.

> An implementation, of course, is free to decide what it wants to
> guarantee, and what it doesn't.  If it is decided, however, that
> the Posix guarantees do not extend to the library, then it is
> important to document this fact; i.e. to indicate somewhere that
> it cannot be missed that configuring the compiler to support
> pthreads does NOT mean what it seems to mean.

I think our guarantee is now well-qualified and surely better than
what was stated pre-gcc 3.0.  I think you actually don't like our
qualification.

I think I can safely assure you that if you construct your C++ code
precisely as we document in section 5.6, you will never violate a
requirement of POSIX nor cause the library itself internally to
violate a requirement of POSIX.  If you cut a corner, what can we do?

Regards,
Loren
Comment 29 James Kanze 2005-05-04 09:14:05 UTC
Subject: Re:  Lack of Posix compliant thread safety in std::basic_string


|> >|> Secondly, it is clear that your bug report is hypothetical.  The
|> >|> library maintainers do not typically deal in hypotheticals.

|> > I guess it is really a question of whether you want quality or
|> > not.  In general, all code is supposed incorrect until proven
|> > otherwise -- in the case of threading issues, this is
|> > particularly important, because of the random factors involved.

|> Of course, we want quality (and we want performance) which is
|> why we insist upon non-hypothetical bug reports (either way,
|> I've already explained what is wrong with your posted code
|> w.r.t. the cited FAQ entry)

I know what was wrong with my code with regards to the FAQ.  I
know that the code didn't conform to *your* threading model.  My
point is that your threading model is, in a certain sense,
wrong.  It isn't what people expect, at least not people who are
familiar with the Posix threading model.

The documentation is in fact very misleading, with phrases like
"like for any other shared resource", when you don't need
external synchronization for other shared resources, *unless*
someone is actively modifying it.

Now, I'm not saying that the Posix model is the only legitimate
one.  But it is the one I, and most people working on Unix
platforms, are familiar with, and expect.  And it is a standard
on those platforms.  You can choose to ignore it, and to ignore
what SGI has done in this regard.  But in that case, you have to
be aware that you are trying to create an island, in which you
are different from everyone else.

|> and why we insist that library users
|> must write correct locking sequences outside the context of the
|> library.  As you well know, there is nothing in POSIX which
|> dictates that a library internalize the locking vs. implying
|> that the user code must hold a mutex lock.

Posix only defines its own functions (most of which are either
thread-safe, or have a thread-safe variant), and it only defines
a binding for C for them.  Other than that, we are left with
general, language neutral statements like the one I quoted.

Obviously, a library not specified by Posix can do what ever it
wants.  In practice, however, user expectations are based on
what Posix does, and that leaves two alteratives:

 -- Model your library on the actual Posix functions, with
    full internal locking -- I can call write on the same file
    descriptor from two different threads, for example, without
    problems.

    A lot of people (although none of the experts I know) seem
    to expect this.  Rogue Wave tries to do it, for example.
    IMHO, however, it isn't reasonable for a library which lets
    references (in the general sense) escape to the user --
    Posix itself makes an exception for functions which return
    pointers to internal data structures.  (It doesn't use those
    words, but if you look at the list of functions which are
    not guaranteed thread-safe, it includes exactly those
    functions, and no others.)

 -- Base your model on the general, language neutral statements.
    This is what SGI does, for example.  The general statement
    is that I need a lock if more than one thread is accessing
    the "memory", and any thread is modifying it.  Since the
    abstraction model of C++ prevents me from actually seeing
    the memory, one is lead to exterpolate "memory" to "object".
    This seems a reasonable interpretation to me.  At least some
    experts agree -- the authors of the the SGI threading model,
    for example.  (I consider the people who were working at SGI
    when that model was defined to be among the best threading
    experts in the world.  That's just my opinion, but every
    time I've had a discussion with them, I've come out knowing
    more than when I went in.)

You can, obviously, take the position that you have defined the
threading model you wish to use, that you know better than
everyone else, and that your decisions cannot be questioned.
And I can take the position that you're just wrong, you don't
know anything about threading, that g++ is worthless in a
multithreaded environment, and publicize that opinion in various
newsgroups.  Such attitudes won't help either of us, however.

This bug report came about because of a discussion in a news
group.  Basically, I said to watch out for std::string with g++
if you are in a multithreaded environment.  I don't remember my
exact words, but I'm pretty sure that the gist would have been
that the g++ implementation of std::string does not behave as
one might expect.  I said it in a newsgroup, rather than making
a bug report, because I knew of the text in the FAQ (or
something similar), and I was convinced that no one here would
consider it an error.  Gaby suggested otherwise; that if I could
describe a case where the code could fail, although no thread
modified the string, I should report it as a bug.  So we're
here, and I'm getting hounded because my email contains trailers
which I can do nothing about:-).  I was inordinately pleased by
his suggestion -- I sometimes get the impression that the g++
developers live in a world of their own, and do not care about
the expectations of "just users", and was particularly pleased
to find that my impression might be wrong.

I'd like to use g++ professionally.  Most of my development is
on Sparcs, under Solaris, and the "offical" compiler is Sun CC.
Which, quite frankly, isn't supported very well.  But how can I
explain to my collegues that they have to deal with two
different threading models, depending on where the library came
from?  Whatever the theoretical issues, practical considerations
suggest that you align yourself with the conventions of the
platform whenever possible.

|> You observe that we should make string<> act exactly like char[]
|> from the user's perspective.

Not necessarily exactly.  I'm looking for a model to define
reasonable expectations.  If we claim that users should use
std::string instead of char[], we shouldn't make it more
difficult to do so, at least not for the common cases.

|> Neither the cited FAQ (which I
|> just reviewed in detail, esp. the full text of section 5.6) nor
|> POSIX modified as you'd wish implies that at all.  char[] is a
|> raw chunk of memory.  In the cited FAQ, we explain that any
|> method call on any shared object must be mutex-protected unless
|> we specially document the method or the entire class.  We are
|> stating that we do nothing inside the library which would make
|> it impossible for a user to write threaded application with our
|> library.  As I've already observed to you, string<> may or may
|> not be specially documented.  If it is, then you need to cite
|> that documentation in your bug report, not the general FAQ
|> section 5.6.  This is basically my point.  The view expressed in
|> the FAQ section 5.6 is entirely consistent with the POSIX view
|> (as conveyed to me by Butenhof's book).  POSIX doesn't care what
|> portions of the C or C++ application have to apply mutex locks
|> to ensure it happens.

|> The prime issue is visibility of the object.  We take
|> responsibility for locking access to internally hidden shared
|> objects whenever the holding reference was not itself visibility
|> shared in the application code.  When the holding reference was
|> visibility shared in the application code, then the user is
|> responsible for locking access.

This is exactly my point.  The only visible parts of std::string
are the individual char's (which are, by the way, "memory
locations", as per the Posix definition).  As long as I do
nothing which modifies the visible parts of an std::string
object, I should be able to access it from multiple threads,
without external locking.  The fact that there are, internally,
other memory locations which may be modified, should be
transparent to me.

|> In your posted code, the user
|> has direct visibility to the shared object thus he must hold a
|> lock under our general model (also the SGI STL general model).

From the SGI document: "The SGI implementation of STL is
thread-safe only in the sense that simultaneous accesses to
distinct containers are safe, and simultaneous read accesses to
to shared containers are safe."

That sounds pretty clear to me: simultaneous read accesses to
shared containers are safe.  That's exactly what I do in my
program: two threads access simultaneously a shared object,
without modifying it.

|> If the shared object was invisible to the user (i.e. a shared
|> cache not implied by the standard which is indirectly accessed
|> by two independently declared library objects) and we failed to
|> hold a mutex internal to the library, then I would be more
|> interested in your report...

|> Our general recommendations precisely covers this case.  I will
|> go farther: Any locking model under C++ which fails to account
|> for object model visibility will likely perform like a real dog.
|> Bow Wow.

I've not noticed performance problems in the SGI
implementations.  Which do give the guarantee I expect.

|> BTW, I noticed that you had no response to my point that you are
|> using an incorrect locking idiom in your posted code w.r.t. our
|> general FAQ.

Because I've never disputed the fact.  The FAQ is highly
misleading, but if you ignore the extraeneous comments, and
restrict yourself to what is actually guaranteed, you find that
it is NOT what is usually guaranteed under Posix, despite claims
to the contrary.

|> If you would like to object that your posted code fails w.r.t.
|> another section of the FAQ which offers special guarantees, then
|> I will likely not complain if you close this PR and open a new
|> one.

|> > Well, there's certainly some confusion, because Posix says one
|> > thing, and your documentation says another.

|> Huh?  POSIX dictates when code must do or not do something to
|> ensure that the threading model is sound.

Right.  It says that if more than one thread is accessing a
memory location, and one or more threads modifies that memor
location, then no external synchronization is necessary.  In my
example, nobody modified any memory locations which were visible
to the user, so according to Posix, no external synchronization
should be necessary.

|> We dictate when user
|> code must do something in line with the POSIX model.

You dictate something more.  If I am to apply the Posix model to
your implementation of std::string, then I must know something
about the internal details to know that a memory location is
being modified, even if that location is not visible to me.

|> We tell
|> the user precisely when they must account for locks when working
|> with library objects.  Is it complex (in C++)?  Yes.

The problem isn't the complexity; writing correct multi-threaded
applications requires some thought, regardless of the model.
The problem is having to deal with a different model for certain
types of objects.

|> > When you say "treat library objects like any other shared
|> > resources', you are contradicting yourself for a system using the
|> > Posix model.

|> There is no contradiction.

Posix says that I do not have to use external synchronization if
I do not modify the objects.  You say I have to.  That is a
contradiction.

|> We made it clear what you have to do
|> when you use the library to ensure that your entire program is
|> thread-safe.  If you insist on cutting a corner, what can we do?

|> If you think other words would be better in our FAQ, then I will
|> take no offense as the last author of that section if you
|> propose a change.

Well, I did suggest that if you can't change the code
immediately, the text should be something along the lines
"unlike the normal case under Posix, you must use external
synchronization even when not modifying objects".

|> > [...] You can use it, but you have to take additional
|> > precautions that Posix says shouldn't be necessary, and that
|> > experienced Posix users don't expect to need.

|> Humm...  If you would like to fix the library to match your
|> expectations instead of crafting application-level code as we
|> (and SGI) suggest, then please be our guest.  Your patch will
|> only be accepted if you don't kill library performance.

My code is conform to the SGI requirements.  And in fact, if I
replace your std::string with any of the SGI containers, it
works.

|> > An implementation, of course, is free to decide what it wants to
|> > guarantee, and what it doesn't.  If it is decided, however, that
|> > the Posix guarantees do not extend to the library, then it is
|> > important to document this fact; i.e. to indicate somewhere that
|> > it cannot be missed that configuring the compiler to support
|> > pthreads does NOT mean what it seems to mean.

|> I think our guarantee is now well-qualified and surely better
|> than what was stated pre-gcc 3.0.  I think you actually don't
|> like our qualification.

It is, of course, senseless to compare with pre-gcc 3.0.  Before
3.0, there was no effort made to be thread safe, and the
compiler simply couldn't be used in a multithreaded application.
(I know.  Twice, I've had to pick up the pieces.)

|> I think I can safely assure you that if you construct your C++
|> code precisely as we document in section 5.6, you will never
|> violate a requirement of POSIX nor cause the library itself
|> internally to violate a requirement of POSIX.  If you cut a
|> corner, what can we do?

If I conform to your requirements, obviously, I will conform to
those of Posix, since yours are more strict.  The reverse is not
true, and that is, in some contexts, a problem.

If the documentation is reworded to stress the difference, and
the fact that there is an unexpected additional restriction, I
can live with it -- the problem only occurs in corner cases, and
is fairly easy to work around, IF one is aware of it.

(There is a more general problem, in that the documentation is
not always easy to find.  This is at least partically true for
almost all complers, however, and I suspect that many don't even
mention what they do or don't guarantee in this regard.  But
finding particular information in the gcc documentation seems
particularly difficult.  There is also an even more general
problem that people don't read the documentation, even when they
find it, but I don't think there is anything you can do about
that.)

--
James Kanze

"This message, including any attachments may contain confidential and
privileged material; it is intended only for the person to whom it is
addressed. Its contents do not constitute a commitment by Credit Agricole
Cheuvreux except where provided for in a written agreement. Credit Agricole
Cheuvreux assumes no liability or responsibility for the consequences
arising out of a delay and/or loss in transit of this message, or for
corruption or other error(s) arising in its transmission and for any misuse
or fraudulent use which may be made thereof. If you are not the intended
recipient, please contact us and abstain from any disclosure, use or
dissemination. To the extent that this message contains research
information and/or recommendations, these are provided on the same basis as
Credit Agricole Cheuvreux's published research and the recipient must have
regard to all disclosures and disclaimers contained therein."




Comment 30 Gabriel Dos Reis 2005-05-04 12:12:52 UTC
Subject: Re:  Lack of Posix compliant thread safety in std::basic_string

"jkanze at cheuvreux dot com" <gcc-bugzilla@gcc.gnu.org> writes:


[...]

| This bug report came about because of a discussion in a news
| group.  Basically, I said to watch out for std::string with g++
| if you are in a multithreaded environment.  I don't remember my
| exact words, but I'm pretty sure that the gist would have been
| that the g++ implementation of std::string does not behave as
| one might expect.  I said it in a newsgroup, rather than making
| a bug report, because I knew of the text in the FAQ (or
| something similar), and I was convinced that no one here would
| consider it an error.  Gaby suggested otherwise; that if I could
| describe a case where the code could fail, although no thread
| modified the string, I should report it as a bug.  So we're
| here, and I'm getting hounded because my email contains trailers
| which I can do nothing about:-).  I was inordinately pleased by

James --
  I think you can abstract over the remark about your mailer, however
annoying it is :-)

As of the model behind the implementation of std::string, I do not
think we have reached an agreement within the maintainers that it is
actually right and your description is wrong.  I'm of the opinion 
that we're reached a point where it causes so much conofusion that
there must be something wrong with it.  Having listening to either
sides and looking at the codes, I'm inclined to consider your report
valid -- however "hypothetical" my fellow maintainers may consider
it :-)  Concurrent read accesses should be guaranted, following POSIX.
This is not because you're threatening to woersen g++'s reputation
or disregard it in production use -- if it were that, I would not
worry ;-).  This is because no matter what we might say in our FAQ,
if the behaviour is odd enough or singular enough, it entails Fear,
Uncertainty and Doubt.  That is not what we want.

-- Gaby
Comment 31 James Kanze 2005-05-04 12:46:14 UTC
Subject: Re:  Lack of Posix compliant thread safety in std::basic_string


|> [...]

|> | This bug report came about because of a discussion in a news
|> | group.  Basically, I said to watch out for std::string with g++
|> | if you are in a multithreaded environment.  I don't remember my
|> | exact words, but I'm pretty sure that the gist would have been
|> | that the g++ implementation of std::string does not behave as
|> | one might expect.  I said it in a newsgroup, rather than making
|> | a bug report, because I knew of the text in the FAQ (or
|> | something similar), and I was convinced that no one here would
|> | consider it an error.  Gaby suggested otherwise; that if I could
|> | describe a case where the code could fail, although no thread
|> | modified the string, I should report it as a bug.  So we're
|> | here, and I'm getting hounded because my email contains trailers
|> | which I can do nothing about:-).  I was inordinately pleased by

|> James --
|>   I think you can abstract over the remark about your mailer, however
|> annoying it is :-)

I did put a smiley behind the complaint.  I can imagine that all
that garbage that gets added is as annoying to others as my
being told to do something about it is to me.  All I can say is
that it annoys me too, and that I"d get rid of it if I could.

|> As of the model behind the implementation of std::string, I do not
|> think we have reached an agreement within the maintainers that it is
|> actually right and your description is wrong.  I'm of the opinion
|> that we're reached a point where it causes so much conofusion that
|> there must be something wrong with it.  Having listening to either
|> sides and looking at the codes, I'm inclined to consider your report
|> valid -- however "hypothetical" my fellow maintainers may consider
|> it :-)  Concurrent read accesses should be guaranted, following POSIX.

All I really ask is that the problem be recognized.  As I said,
it only affects certain specific cases, and there are acceptable
work arounds.  If the doc says explicitly, we're doing something
different here, I can live with that -- I would suppose that
there were some good technical reasons (or possibly
non-technical -- the world isn't 100% controlled by technology
yet) for it.

Responses along the lines of "it's only hypothetical", or
"you're not playing the game according to our rules", are not
helpful.  They're not helpful because I know that already.

The window of vulnerability is exceedingly small, which means
that creating a test case which reliably fails is exceedingly
difficult.  From your comments in the newsgroup, I was fairly
sure that you would accept my next best approach.  From what I
know of the extremely high technical competence of the other g++
developers, including the original author of std::string, I
rather thought that they would, too.  I'll admit that the
"hypothetical" bit surprised me.

The fact that I am not playing according to the official rules
is another issue -- I know that, too, but I feel 1) that the
official rules aren't really the best with regards to usability,
and above all, 2) the way they are formulated is somewhat
"ingenious".  Point 1 is, of course, personal opinion, but the
fact that some experts are also led astray by the formulation is
IMHO worth considering as a problem in itself.

|> This is not because you're threatening to woersen g++'s reputation
|> or disregard it in production use -- if it were that, I would not
|> worry ;-).

That's because you know that the competition is Sun CC (which is
no competition at all), and that I don't get to choose the
compiler anyway:-).

Seriously, however, I think you understand why I might feel
obligated in certain contexts to mention the fact that g++ does
have problems with regards to threading.  Not problems like pre
3.0 had, of course, but the fact is that it doesn't always
conform to the local conventions.  Sometimes, for some very good
reasons, but differences have to be doubly documented, because
people don't expect them.

|> This is because no matter what we might say in our FAQ,
|> if the behaviour is odd enough or singular enough, it entails Fear,
|> Uncertainty and Doubt.  That is not what we want.

I think that's the key.  People should feel at ease with the
tool.  Globally, I like g++ -- it's become my preferred compiler
for most things.  Which just makes me more upset about any
weaknesses.  (Like the absence of export, for example:-).)

--
James Kanze

"This message, including any attachments may contain confidential and
privileged material; it is intended only for the person to whom it is
addressed. Its contents do not constitute a commitment by Credit Agricole
Cheuvreux except where provided for in a written agreement. Credit Agricole
Cheuvreux assumes no liability or responsibility for the consequences
arising out of a delay and/or loss in transit of this message, or for
corruption or other error(s) arising in its transmission and for any misuse
or fraudulent use which may be made thereof. If you are not the intended
recipient, please contact us and abstain from any disclosure, use or
dissemination. To the extent that this message contains research
information and/or recommendations, these are provided on the same basis as
Credit Agricole Cheuvreux's published research and the recipient must have
regard to all disclosures and disclaimers contained therein."




Comment 32 dank 2005-06-03 23:37:45 UTC
I'm tempted to start a new PR with summary "std::string slow in multithreaded
programs due to COW" so we can focus on the quality of implementation
issue, and leave aside the question of correctness.

James, could you please participate in these discussions by using
the web interface (e.g. http://gcc.gnu.org/PR21334)?  That would
avoid the boilerplate disclaimer that gets appended to each one
of your emails.  Thanks!
Comment 33 Paolo Carlini 2005-06-03 23:51:47 UTC
Certainly, numbers from actual benchmarks would be useful. In order to make these
comparisons easier, next weeks I will add to the v7-branch basic_string an
alternate base-class implementation which avoids reference-counting (+ simple
short-string optimization and needed changes to strings in exceptions classes).
Will be useful also to start evaluating various options for forms of simulated
move-semantics that should extend somewhat the range of application also to 
non-MT. All-in-all I would probably advice against the idea of opening a separate
PR - the performance issue is *well* known, in the abstract.
Comment 34 Eric Botcazou 2005-10-22 08:24:44 UTC
Not SPARC/Solaris specific.
Comment 35 gianni 2007-05-07 22:08:04 UTC
Is this really a bug ?  Any discussion in the upcoming standardization of threads that talks about calling a non-const begin() on a std::string ? 

If we take James's code and replace the "g" definition like so:

std::string         g_modifyable ;
const std::string & g = g_modifyable;

... there is no race condition.

Who said that calling begin() on a non const std::string should be thread safe ?
Comment 36 gianni 2007-05-08 06:24:34 UTC
BTW - you don't need a multi-threaded test to make this problem show up.

The code below is plain old C++ and breaks.  Again, I hesitate in calling this one a bug because begin() on a non-const object make be "allowed" to invalidate previous const iterators, I'm not sure.  If it is not allowed to then this is a legitimate bug - no threads needed.  BTW, this test works on :

"Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42"

#include <string>
#include <cassert>

std::string str( 1<<20, 'a' );

void foo()
{

    std::string str2( str );

    const std::string & rstr2 = str2;

//     str2.begin();

    std::string::const_iterator bc = rstr2.begin();
    std::string::iterator bnc = str2.begin();

    std::string::const_iterator ec = rstr2.end();
    std::string::iterator enc = str2.end();

    std::string r1( bc, ec );
    std::string r2( bnc, enc );

    assert( r1 == r2 );

    assert( r1 == str );
}

int main()
{
    foo();
}
Comment 37 James Kanze 2007-05-08 16:11:02 UTC
Subject: Re:  Lack of Posix compliant thread safety in std::basic_string

On 7 May 2007 21:08:05 -0000, gianni at mariani dot ws
<gcc-bugzilla@gcc.gnu.org> wrote:

> ------- Comment #35 from gianni at mariani dot ws  2007-05-07 22:08 -------
> Is this really a bug ?  Any discussion in the upcoming standardization of
> threads that talks about calling a non-const begin() on a std::string ?

> If we take James's code and replace the "g" definition like so:

> std::string         g_modifyable ;
> const std::string & g = g_modifyable;

> ... there is no race condition.

> Who said that calling begin() on a non const std::string
> should be thread safe ?

Posix and common sense.  Just as using a char* (rather than a
char const*) to access a char[] is thread safe.

But let's not start that again.  The ISO committee is discussing
the issue, and will doubtlessly decide one way or another.
(Before they get that far, they have to agree on what a thread
means:-).)  In the meantime, all of the other implementations
work (but have other disadvantages); the g++ implementation
doesn't.

Comment 38 James Kanze 2007-05-08 16:21:45 UTC
Subject: Re:  Lack of Posix compliant thread safety in std::basic_string

On 8 May 2007 05:24:35 -0000, gianni at mariani dot ws
<gcc-bugzilla@gcc.gnu.org> wrote:

> ------- Comment #36 from gianni at mariani dot ws  2007-05-08 06:24 -------

> BTW - you don't need a multi-threaded test to make this problem show up.

> The code below is plain old C++ and breaks.  Again, I hesitate
> in calling this one a bug because begin() on a non-const
> object make be "allowed" to invalidate previous const
> iterators, I'm not sure.

In this case, the standard specifically says that the code is
invalid.  (For all of the standard containers, the standard
specifies what operations invalidate iterators.)

> If it is not allowed to then this is
> a legitimate bug - no threads needed.  BTW, this test works on:

> "Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42"

As does my example:-).  Different implementations will behave
differently.

Comment 39 Paolo Carlini 2007-05-08 19:50:23 UTC
The proper status of this PR is SUSPENDED. Today, mid of 2007, we all more or less concur that <string> is better implemented without reference-counting, optimized for short strings, and, of course, exploiting rvalue references. Indeed, we are already providing the first two features in ext/vstring.h, and the latter will be added ASAP. As for changing <string> itself, of course we have to wait for the first ABI break.
Comment 40 gianni 2007-05-09 01:54:44 UTC
Paolo writes:
> ... concur that <string> is better implemented without reference-counting ...

Could I ask you to enumerate the reasons why you come to this conclusion ?  I just want understand better why (royal) we came to this conclusion.

Comment 41 Andrew Pinski 2007-06-14 01:27:59 UTC
*** Bug 32261 has been marked as a duplicate of this bug. ***
Comment 42 tlknv 2012-05-30 18:41:15 UTC
I would like to propose a patch that should solve this problem.
How can I propose the patch?
Comment 43 Marc Glisse 2012-05-30 19:38:12 UTC
(In reply to comment #42)
> I would like to propose a patch that should solve this problem.

Note that there are plans to ditch the current implementation of basic_string in the future (4.9?), replacing it with the code in ext/vstring.h.

> How can I propose the patch?

http://gcc.gnu.org/contribute.html

(in short:
* sign the contributor agreement if your patch is more than a couple lines
* post your patch to the gcc-patches *and* libstdc++ mailing lists)
Comment 44 tlknv 2012-05-31 14:43:14 UTC
Thanks Marc.
I have posted my patch at http://gcc.gnu.org/ml/gcc-patches/2012-05/msg02086.html
This is essentially a couple of line change so I hope I don't need to find and sign the contributor agreement.
Comment 45 Jonathan Wakely 2012-05-31 15:54:05 UTC
Thanks for the patch, I'll take a look asap.

Just to answer this older comment ...

(In reply to comment #35)
> Who said that calling begin() on a non const std::string should be thread safe
> ?

For the record, as it hasn't been mentioned here explicitly, C++11 says it:

23.2.2 Container data races [container.requirements.dataraces]
-1- For purposes of avoiding data races (17.6.5.9), implementations shall consider the following functions to be const: begin, end, rbegin, rend, front, back, data, find, lower_bound, upper_bound, equal_range, at and, except in associative or unordered associative containers, operator[].

So in C++11 this is definitely a bug (but then a COW string is non-conforming in other ways too, which is why it will go away when the ABI is changed.)
Comment 46 tlknv 2012-05-31 16:13:15 UTC
Thanks Jonathan.
I didn't know about the new 23.2.2 requirements.
> but then a COW string is non-conforming in other ways too
Which ways? I know that I should read the standard but you might have a quick answer.
Comment 47 Jonathan Wakely 2012-05-31 16:42:27 UTC
21.4.1 [string.require] says that the non-const forms of operator[], at, front, back, begin, rbegin, end and rend may not invalidate references, pointers and iterators (so must not reallocate or modify the string)

This example shows that requirement is not met:

    std::string s("str");
    const char* p = s.data();
    {
        std::string s2(s);
        (void) s[0];
    }
    std::cout << *p << '\n';  // p is dangling

Also the copy constructor requirements in Table 64 require the new object to have a copy of the data.  I think there are other reasons too.
Comment 48 Patrick J. LoPresti 2014-07-08 00:13:53 UTC
Was this ever fixed? I do not see any mention of it in https://gcc.gnu.org/gcc-4.9/changes.html nor 4.8/changes.html nor 4.7/changes.html nor...

In any event, "suspended" seems like the wrong status for this report, since it is unambiguously a bug with respect to C++11 (?)
Comment 49 Tim O'Neil 2014-11-09 19:15:31 UTC
In the hope this will help, I try to stay pretty current:
---
$ clang++ --version
Ubuntu clang version 3.5-1ubuntu1 (trunk) (based on LLVM 3.5)
Target: x86_64-pc-linux-gnu
Thread model: posix
$ clang++ -Wall -std=c++11 bugtest.c -o bugtest
clang: warning: treating 'c' input as 'c++' when in C++ mode, this behavior is deprecated
$ ./bugtest
terminate called after throwing an instance of 'std::length_error'
  what():  basic_string::_S_create
Aborted
[Exit 134 SIGABRT]
$ clang++ -Wall -std=c++11 -stdlib=libc++ bugtest.c -o bugtest
clang: warning: treating 'c' input as 'c++' when in C++ mode, this behavior is deprecated
$ ./bugtest
$ g++ --version
g++ (Ubuntu 4.8.2-19ubuntu1) 4.8.2
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ g++ -Wall -std=c++11 bugtest.c -o bugtest
$ ./bugtest
terminate called after throwing an instance of 'std::length_error'
  what():  basic_string::_S_create
Aborted
[Exit 134 SIGABRT]
$
Comment 50 Tim O'Neil 2014-11-09 19:18:44 UTC
(In reply to Tim O'Neil from comment #49)
> In the hope this will help, I try to stay pretty current:

Oh, this is using the code James posted on 2005-05-02 above.
Comment 51 Jonathan Wakely 2015-03-23 13:01:23 UTC
This is no longer an issue when using the new non-reference-counted std::string implementation in GCC 5.