Bug 40518 - data races when calling std::string::erase() on empty string
data races when calling std::string::erase() on empty string
Status: RESOLVED FIXED
Product: gcc
Classification: Unclassified
Component: libstdc++
4.3.2
: P3 normal
: 4.4.4
Assigned To: Not yet assigned to anyone
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2009-06-22 10:27 UTC by Jonathan Wakely
Modified: 2014-02-16 13:12 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail: 4.2.2 4.3.2 4.4.3
Last reconfirmed: 2009-06-22 11:03:37


Attachments
Draft (891 bytes, patch)
2009-06-22 11:06 UTC, Paolo Carlini
Details | Diff
Updated (2.58 KB, patch)
2009-06-22 17:56 UTC, Paolo Carlini
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jonathan Wakely 2009-06-22 10:27:24 UTC
from http://etbe.coker.com.au/2009/06/22/valgrindhelgrind-and-stl-string

simplified test case:

#include <pthread.h>
#include <string>


void *do_work(void *)
{
    std::string s;
    s.erase();
    return 0;
}

int main()
{
    pthread_t tid[2];
    pthread_create(&tid[0], NULL, do_work, NULL);
    pthread_create(&tid[1], NULL, do_work, NULL);
    pthread_join(tid[0], NULL);
    pthread_join(tid[1], NULL);

    return 0;
}

Helgrind shows data races, because calling erase() on an empty string calls _M_set_length_and_sharable() on the shared empty _Rep object, which assigns zero to the refcount, length, and refdata[0].

Those locations contain zero already, so as long as the writes are atomic then no values are modified, but it is a race.

$ valgrind --tool=helgrind ./a.out
==17699== Helgrind, a thread error detector.
==17699== Copyright (C) 2007-2008, and GNU GPL'd, by OpenWorks LLP et al.
==17699== Using LibVEX rev 1878, a library for dynamic binary translation.
==17699== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
==17699== Using valgrind-3.4.0, a dynamic binary instrumentation framework.
==17699== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
==17699== For more details, rerun with: -v
==17699==
==17699== Thread #3 was created
==17699==    at 0x8EDC38: clone (in /lib/libc-2.5.so)
==17699==    by 0x4AA960F: pthread_create@* (hg_intercepts.c:214)
==17699==    by 0x804867D: main (etbe2.cc:16)
==17699==
==17699== Thread #2 was created
==17699==    at 0x8EDC38: clone (in /lib/libc-2.5.so)
==17699==    by 0x4AA960F: pthread_create@* (hg_intercepts.c:214)
==17699==    by 0x8048657: main (etbe2.cc:15)
==17699==
==17699== Possible data race during read of size 4 at 0x607d688 by thread #3
==17699==    at 0x602C162: std::string::erase(unsigned int, unsigned int) (basic_string.h:607)
==17699==    by 0x80486DF: do_work(void*) (etbe2.cc:8)
==17699==    by 0x4AA970B: mythread_wrapper (hg_intercepts.c:194)
==17699==    by 0x96745A: start_thread (in /lib/libpthread-2.5.so)
==17699==    by 0x8EDC4D: clone (in /lib/libc-2.5.so)
==17699==  This conflicts with a previous write of size 4 by thread #2
==17699==    at 0x602B99F: std::string::_M_mutate(unsigned int, unsigned int, unsigned int) (basic_string.h:207)
==17699==    by 0x602C187: std::string::erase(unsigned int, unsigned int) (basic_string.h:1133)
==17699==    by 0x80486DF: do_work(void*) (etbe2.cc:8)
==17699==    by 0x4AA970B: mythread_wrapper (hg_intercepts.c:194)
==17699==    by 0x96745A: start_thread (in /lib/libpthread-2.5.so)
==17699==    by 0x8EDC4D: clone (in /lib/libc-2.5.so)
==17699==
==17699== Possible data race during read of size 4 at 0x607d688 by thread #3
==17699==    at 0x602B915: std::string::_M_mutate(unsigned int, unsigned int, unsigned int) (basic_string.h:607)
==17699==    by 0x602C187: std::string::erase(unsigned int, unsigned int) (basic_string.h:1133)
==17699==    by 0x80486DF: do_work(void*) (etbe2.cc:8)
==17699==    by 0x4AA970B: mythread_wrapper (hg_intercepts.c:194)
==17699==    by 0x96745A: start_thread (in /lib/libpthread-2.5.so)
==17699==    by 0x8EDC4D: clone (in /lib/libc-2.5.so)
==17699==  This conflicts with a previous write of size 4 by thread #2
==17699==    at 0x602B99F: std::string::_M_mutate(unsigned int, unsigned int, unsigned int) (basic_string.h:207)
==17699==    by 0x602C187: std::string::erase(unsigned int, unsigned int) (basic_string.h:1133)
==17699==    by 0x80486DF: do_work(void*) (etbe2.cc:8)
==17699==    by 0x4AA970B: mythread_wrapper (hg_intercepts.c:194)
==17699==    by 0x96745A: start_thread (in /lib/libpthread-2.5.so)
==17699==    by 0x8EDC4D: clone (in /lib/libc-2.5.so)
==17699==
==17699== Possible data race during read of size 4 at 0x607d690 by thread #3
==17699==    at 0x602B9B8: std::string::_M_mutate(unsigned int, unsigned int, unsigned int) (basic_string.tcc:450)
==17699==    by 0x602C187: std::string::erase(unsigned int, unsigned int) (basic_string.h:1133)
==17699==    by 0x80486DF: do_work(void*) (etbe2.cc:8)
==17699==    by 0x4AA970B: mythread_wrapper (hg_intercepts.c:194)
==17699==    by 0x96745A: start_thread (in /lib/libpthread-2.5.so)
==17699==    by 0x8EDC4D: clone (in /lib/libc-2.5.so)
==17699==  This conflicts with a previous write of size 4 by thread #2
==17699==    at 0x602B998: std::string::_M_mutate(unsigned int, unsigned int, unsigned int) (basic_string.h:201)
==17699==    by 0x602C187: std::string::erase(unsigned int, unsigned int) (basic_string.h:1133)
==17699==    by 0x80486DF: do_work(void*) (etbe2.cc:8)
==17699==    by 0x4AA970B: mythread_wrapper (hg_intercepts.c:194)
==17699==    by 0x96745A: start_thread (in /lib/libpthread-2.5.so)
==17699==    by 0x8EDC4D: clone (in /lib/libc-2.5.so)
==17699==
==17699== Possible data race during write of size 4 at 0x607d690 by thread #3
==17699==    at 0x602B998: std::string::_M_mutate(unsigned int, unsigned int, unsigned int) (basic_string.h:201)
==17699==    by 0x602C187: std::string::erase(unsigned int, unsigned int) (basic_string.h:1133)
==17699==    by 0x80486DF: do_work(void*) (etbe2.cc:8)
==17699==    by 0x4AA970B: mythread_wrapper (hg_intercepts.c:194)
==17699==    by 0x96745A: start_thread (in /lib/libpthread-2.5.so)
==17699==    by 0x8EDC4D: clone (in /lib/libc-2.5.so)
==17699==  This conflicts with a previous write of size 4 by thread #2
==17699==    at 0x602B998: std::string::_M_mutate(unsigned int, unsigned int, unsigned int) (basic_string.h:201)
==17699==    by 0x602C187: std::string::erase(unsigned int, unsigned int) (basic_string.h:1133)
==17699==    by 0x80486DF: do_work(void*) (etbe2.cc:8)
==17699==    by 0x4AA970B: mythread_wrapper (hg_intercepts.c:194)
==17699==    by 0x96745A: start_thread (in /lib/libpthread-2.5.so)
==17699==    by 0x8EDC4D: clone (in /lib/libc-2.5.so)
==17699==
==17699== Possible data race during write of size 4 at 0x607d688 by thread #3
==17699==    at 0x602B99F: std::string::_M_mutate(unsigned int, unsigned int, unsigned int) (basic_string.h:207)
==17699==    by 0x602C187: std::string::erase(unsigned int, unsigned int) (basic_string.h:1133)
==17699==    by 0x80486DF: do_work(void*) (etbe2.cc:8)
==17699==    by 0x4AA970B: mythread_wrapper (hg_intercepts.c:194)
==17699==    by 0x96745A: start_thread (in /lib/libpthread-2.5.so)
==17699==    by 0x8EDC4D: clone (in /lib/libc-2.5.so)
==17699==  This conflicts with a previous write of size 4 by thread #2
==17699==    at 0x602B99F: std::string::_M_mutate(unsigned int, unsigned int, unsigned int) (basic_string.h:207)
==17699==    by 0x602C187: std::string::erase(unsigned int, unsigned int) (basic_string.h:1133)
==17699==    by 0x80486DF: do_work(void*) (etbe2.cc:8)
==17699==    by 0x4AA970B: mythread_wrapper (hg_intercepts.c:194)
==17699==    by 0x96745A: start_thread (in /lib/libpthread-2.5.so)
==17699==    by 0x8EDC4D: clone (in /lib/libc-2.5.so)
==17699==
==17699== Possible data race during write of size 1 at 0x607d694 by thread #3
==17699==    at 0x602B9A1: std::string::_M_mutate(unsigned int, unsigned int, unsigned int) (char_traits.h:246)
==17699==    by 0x602C187: std::string::erase(unsigned int, unsigned int) (basic_string.h:1133)
==17699==    by 0x80486DF: do_work(void*) (etbe2.cc:8)
==17699==    by 0x4AA970B: mythread_wrapper (hg_intercepts.c:194)
==17699==    by 0x96745A: start_thread (in /lib/libpthread-2.5.so)
==17699==    by 0x8EDC4D: clone (in /lib/libc-2.5.so)
==17699==  This conflicts with a previous write of size 1 by thread #2
==17699==    at 0x602B9A1: std::string::_M_mutate(unsigned int, unsigned int, unsigned int) (char_traits.h:246)
==17699==    by 0x602C187: std::string::erase(unsigned int, unsigned int) (basic_string.h:1133)
==17699==    by 0x80486DF: do_work(void*) (etbe2.cc:8)
==17699==    by 0x4AA970B: mythread_wrapper (hg_intercepts.c:194)
==17699==    by 0x96745A: start_thread (in /lib/libpthread-2.5.so)
==17699==    by 0x8EDC4D: clone (in /lib/libc-2.5.so)
==17699==
==17699== ERROR SUMMARY: 6 errors from 6 contexts (suppressed: 8 from 2)


The storage is written to, even if not modified, so this comment may not hold true:

        static _Rep&
        _S_empty_rep()
        { 
	  // NB: Mild hack to avoid strict-aliasing warnings.  Note that
	  // _S_empty_rep_storage is never modified and the punning should
	  // be reasonably safe in this case.
	  void* __p = reinterpret_cast<void*>(&_S_empty_rep_storage);
	  return *reinterpret_cast<_Rep*>(__p);
	}

As expected, the helgrind errors are not present with --enable-fully-dynamic-string
Comment 1 Paolo Carlini 2009-06-22 11:05:53 UTC
Jon, thanks for clearly pointing out that warning: indeed, at that time I noticed that in some cases we could overwrite the same values. Anyway, the very simple attached draft seems then the way to go. Can you test it? Thanks.
Comment 2 Paolo Carlini 2009-06-22 11:06:22 UTC
Created attachment 18046 [details]
Draft
Comment 3 Jonathan Wakely 2009-06-22 11:58:25 UTC
That looks good.  I didn't run the v3 testsuite, but it fixes the helgrind errors in the test cases
Comment 4 Paolo Carlini 2009-06-22 12:01:50 UTC
Great. In the meanwhile I did run the testsuite and everything is fine. Let's wait a bit more, in case something trickier is noticed, and close the issue.
Comment 5 Paolo Carlini 2009-06-22 17:55:56 UTC
Jo, I think we need an additional hunk, to deal with s.erase(s.begin(), s.end()). I'm attaching an updated draft...
Comment 6 Paolo Carlini 2009-06-22 17:56:30 UTC
Created attachment 18048 [details]
Updated
Comment 7 Paolo Carlini 2009-06-22 17:56:54 UTC
Jon, that is ;)
Comment 8 Jonathan Wakely 2009-06-22 20:30:15 UTC
the revised patch tests ok, no helgrind errors from

std::string s;
s.erase();
s.erase(s.begin(), s.end());
Comment 9 paolo@gcc.gnu.org 2009-06-23 12:36:56 UTC
Subject: Bug 40518

Author: paolo
Date: Tue Jun 23 12:36:43 2009
New Revision: 148850

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=148850
Log:
2009-06-23  Paolo Carlini  <paolo.carlini@oracle.com>

	PR libstdc++/40518
	* include/bits/basic_string.h (basic_string<>::_Rep::
	_M_set_length_and_sharable): Do not write the empty rep.
	(basic_string<>::erase(iterator, iterator)): Likewise,
	move out of line...
	* include/bits/basic_string.tcc: ... here.

Modified:
    trunk/libstdc++-v3/ChangeLog
    trunk/libstdc++-v3/include/bits/basic_string.h
    trunk/libstdc++-v3/include/bits/basic_string.tcc

Comment 10 Paolo Carlini 2009-06-23 12:39:38 UTC
I think this is for mainline only. In a couple of few weeks a I will give it a second thought for 4_4-branch but certainly isn't a regression.
Comment 11 Bart Van Assche 2010-04-07 17:54:43 UTC
(In reply to comment #10)
> I think this is for mainline only. In a couple of few weeks a I will give it a
> second thought for 4_4-branch but certainly isn't a regression.
 
Any status update here ? Many Helgrind and DRD users would be grateful if this could be backported to gcc 4.4.
Comment 12 Paolo Carlini 2010-04-07 18:11:23 UTC
In my opinion it's too late now, I'm not even sure a 4.4.4 will be released any time soon, and 4.5.0 is around the corner. But if Jon would also like to see it in 4.4.4 and wants to backport the patch I do not object.
Comment 13 Jonathan Wakely 2010-04-07 19:06:32 UTC
I agree it's probably not worth backporting to a release branch, it wasn't a regression.

Haven't affected users written valgrind suppression files by now? :)
Comment 14 Bart Van Assche 2010-04-08 06:03:44 UTC
(In reply to comment #13)
> I agree it's probably not worth backporting to a release branch, it wasn't a
> regression.
> 
> Haven't affected users written valgrind suppression files by now? :)

Since the races on _S_empty_rep_storage are triggered by an inline function many different suppression patterns would have to be added in order to suppress these races. And adding such suppression patterns would suppress several real races.
Comment 15 Paolo Carlini 2010-04-08 09:33:34 UTC
Re-open to..
Comment 16 Paolo Carlini 2010-04-08 09:33:59 UTC
unassign myself
Comment 17 Paolo Carlini 2010-04-08 09:34:23 UTC
.
Comment 18 Bart Van Assche 2010-04-08 10:28:44 UTC
(In reply to comment #12)
> In my opinion it's too late now, I'm not even sure a 4.4.4 will be released any
> time soon, and 4.5.0 is around the corner. But if Jon would also like to see it
> in 4.4.4 and wants to backport the patch I do not object.
 
Please keep in mind that this behavior can have a severe negative impact on multithreaded C++ software that uses class std::string intensively because the unnecessary writes to _S_empty_rep_storage will cause unnecessary cache line bouncing.
Comment 19 Paolo Carlini 2010-04-08 10:33:32 UTC
Please keep in mind that nobody complained so far, over the last ~10 years, and 4.4 is close to its end of life.
Comment 20 Jonathan Wakely 2010-04-08 13:51:17 UTC
You can still reduce the helgrind output with just one suppression

{
  libstdcxx_std_string_race_pr40518
  Helgrind:Race
  fun:_ZNSs9_M_mutateEmmm
}

I'm not convinced the impact is "severe", COW strings already have problems in multiprocessor systems, I don't think erase() on empty strings is the bottleneck!   It's not been a problem in my own experience, and I'm not aware of any distros having felt the need to backport the patch to their 4.4 packages.

I'll look into backporting it tonight on condition you cease the hyperbole
Comment 21 Paolo Carlini 2010-04-08 14:04:10 UTC
> I'll look into backporting it tonight on condition you cease the hyperbole
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^        

That's why (among other reasons) I voted in favor of appointing Jon maintainer ;) Agreed.

Comment 22 Jonathan Wakely 2010-04-09 08:03:05 UTC
I'm unable to bootstrap 4.4 for some reason, I'll try again tomorrow
Comment 23 Jonathan Wakely 2010-04-10 13:46:37 UTC
Subject: Bug 40518

Author: redi
Date: Sat Apr 10 13:46:25 2010
New Revision: 158190

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=158190
Log:
2010-04-10  Jonathan Wakely  <jwakely.gcc@gmail.com>

	Backport:
	2009-06-23  Paolo Carlini  <paolo.carlini@oracle.com>

	PR libstdc++/40518
	* include/bits/basic_string.h (basic_string<>::_Rep::
	_M_set_length_and_sharable): Do not write the empty rep.
	(basic_string<>::erase(iterator, iterator)): Likewise,
	move out of line...
	* include/bits/basic_string.tcc: ... here.


Modified:
    branches/gcc-4_4-branch/libstdc++-v3/ChangeLog
    branches/gcc-4_4-branch/libstdc++-v3/include/bits/basic_string.h
    branches/gcc-4_4-branch/libstdc++-v3/include/bits/basic_string.tcc

Comment 24 Jonathan Wakely 2010-04-10 13:48:02 UTC
Fixed for 4.4.4
Comment 25 Jackie Rosen 2014-02-16 13:12:27 UTC
*** Bug 260998 has been marked as a duplicate of this bug. ***
Seen from the domain http://volichat.com
Page where seen: http://volichat.com/adult-chat-rooms
Marked for reference. Resolved as fixed @bugzilla.