[ Re libstdc++/5347 and libstdc++/5037 ] In our project which is heavily multi-threaded we would occasionally get randon crashes which were associated with the iostreams code. Managed to get it down to a very small test case which indicates the problems. [ I'm also not sure that the file attachments have worked, I'm sending them to the gcc-mailing list as well.... ] Release: gcc-3.0.4-cvs-20020116 Environment: System: SunOS primo 5.6 Generic_105181-26 sun4u sparc SUNW,Ultra-4 Architecture: sun4configured with: /disks/nermal/andrewp/cvs/gcc-3_0-branch/configure --prefix=/opt/gcc-3.0.4-20020116-sparc-sun-solaris2.6-gnu-asld --with-gnu-as --with-gnu-ld --enable-threads=solaris --enable-version-specific-runtime-libs --disable-shared --enable-languages=c++ [ This is on a 4 processor machine ] How-To-Repeat: Compile and run the attached program with gcc-3.0.x on a multiprocessor system (in this case UltraSparc/Solaris6, but similar things happen on a dual processor i686/Redhat7.2). Sometimes it will work, sometimes you will get a Bus Error and sometimes a Segmentation Fault % /opt/gcc-3.0.4-20020116-sparc-sun-solaris2.6-gnu-asld/bin/g++ -g main.cxx -lpthread % a.out % a.out % a.out Bus error (core dumped) % gdb a.out core [...] (gdb) where #0 0x00020270 in std::__ctype_abstract_base<char>::widen(char) const () at /disks/primo4/workplaces/andrewp/gnu/gcc-3.0.4-20020116-build/sparc-sun-solaris2.6/libstdc++-v3/include/bits/locale_facets.h:97 #1 0x0001fdd4 in std::basic_ios<char, std::char_traits<char> >::widen(char) const (this=0xef109ccc, __c=32 ' ') at /disks/primo4/workplaces/andrewp/gnu/gcc-3.0.4-20020116-build/sparc-sun-solaris2.6/libstdc++-v3/include/bits/locale_facets.h:97 #2 0x0001f850 in std::basic_ios<char, std::char_traits<char> >::init(std::basic_streambuf<char, std::char_traits<char> >*) (this=0xef109ccc, __sb=0x0) at /disks/primo4/workplaces/andrewp/gnu/gcc-3.0.4-20020116-build/sparc-sun-solaris2.6/libstdc++-v3/include/bits/basic_ios.tcc:125 #3 0x0001f918 in std::istream::istream(std::basic_streambuf<char, std::char_traits<char> >*) () at /disks/primo4/workplaces/andrewp/gnu/gcc-3.0.4-20020116-build/sparc-sun-solaris2.6/libstdc++-v3/include/bits/std_istream.h:74 #4 0x0001f4e8 in std::iostream::iostream(std::basic_streambuf<char, std::char_traits<char> >*) (this=0xef109ccc, __vtt_parm=0x4be1c, __sb=0x0) at /disks/primo4/workplaces/andrewp/gnu/gcc-3.0.4-20020116-build/sparc-sun-solaris2.6/libstdc++-v3/include/bits/std_istream.h:274 #5 0x0001f318 in std::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::basic_stringstream(std::_Ios_Openmode) (this=0xef109c70, __m=24) at /opt/gcc-3.0.4-20020116-sparc-sun-solaris2.6-gnu-asld/lib/gcc-lib/sparc-sun-solaris2.6/3.0.4/include/g++/bits/std_sstream.h:331 #6 0x000113ac in threadFunc () at main.cxx:11 (gdb) quit % a.out % a.out % a.out Segmentation fault (core dumped) % gdb a.out core [...] (gdb) where #0 0x00000000 in ?? () #1 0x00017678 in std::locale::facet::_M_remove_reference() (this=0x59d48) at /disks/nermal/andrewp/cvs/gcc-3_0-branch/libstdc++-v3/src/locale.cc:537 #2 0x00019c90 in std::locale::_Impl::~_Impl() (this=0x592d8) at /disks/primo4/workplaces/andrewp/gnu/gcc-3.0.4-20020116-build/sparc-sun-solaris2.6/libstdc++-v3/include/bits/stl_iterator.h:478 #3 0x00023fc4 in std::locale::_Impl::_M_remove_reference() (this=0x592d8) at /disks/primo4/workplaces/andrewp/gnu/gcc-3.0.4-20020116-build/sparc-sun-solaris2.6/libstdc++-v3/include/bits/localefwd.h:332 #4 0x00016fc0 in std::locale::operator=(std::locale const&) (this=0xef109d30, __other=@0xef1099f8) at /disks/nermal/andrewp/cvs/gcc-3_0-branch/libstdc++-v3/src/locale.cc:406 #5 0x00016224 in std::ios_base::_M_init() (this=0xef109d30) at /disks/nermal/andrewp/cvs/gcc-3_0-branch/libstdc++-v3/src/ios.cc:267 #6 0x0001f824 in std::basic_ios<char, std::char_traits<char> >::init(std::basic_streambuf<char, std::char_traits<char> >*) (this=0xef109ccc, __sb=0x0) at /disks/primo4/workplaces/andrewp/gnu/gcc-3.0.4-20020116-build/sparc-sun-solaris2.6/libstdc++-v3/include/bits/basic_ios.tcc:122 #7 0x0001f918 in std::istream::istream(std::basic_streambuf<char, std::char_traits<char> >*) () at /disks/primo4/workplaces/andrewp/gnu/gcc-3.0.4-20020116-build/sparc-sun-solaris2.6/libstdc++-v3/include/bits/std_istream.h:74 #8 0x0001f4e8 in std::iostream::iostream(std::basic_streambuf<char, std::char_traits<char> >*) (this=0xef109ccc, __vtt_parm=0x4be1c, __sb=0x0) at /disks/primo4/workplaces/andrewp/gnu/gcc-3.0.4-20020116-build/sparc-sun-solaris2.6/libstdc++-v3/include/bits/std_istream.h:274 #9 0x0001f318 in std::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::basic_stringstream(std::_Ios_Openmode) (this=0xef109c70, __m=24) at /opt/gcc-3.0.4-20020116-sparc-sun-solaris2.6-gnu-asld/lib/gcc-lib/sparc-sun-solaris2.6/3.0.4/include/g++/bits/std_sstream.h:331 #10 0x000113ac in threadFunc () at main.cxx:11 (gdb) quit
Fix: With the attached patch, the attached program 'always' seems to work (I had it in a while loop running for about an hour...) I don't make any guarantees for the correctness of the patch, but it definitely addresses some of the 'XXX MT' issues in the source code (eg using _Atomic_word, __atomic_add, __exchange_and_add and a lock) The gcc-3.1 cvs code is very similar, if not the same...
Responsible-Changed-From-To: unassigned->ljrittle Responsible-Changed-Why: Taking at Ben's request.
State-Changed-From-To: open->analyzed State-Changed-Why: Wacking patch into final shape (addressing issue of only zero PODs being guaranteed to be initiallized before main() starts).
State-Changed-From-To: analyzed->feedback State-Changed-Why: Andrew, there were two minor issues with your patch: (And it took collective thought to figure all this out so don't feel bad. ;-) In general, static _Atomic_word should always be init'd to 0 with C++. We think g++ follows the init order rules of C which are tighter than C++ but for C++ non-zero static values may not be init'd until after main() has run (the only rule says it must be done "before the related block [scope] is entered" which might be after threads were started). Secondly: < if (--_M_references == 0) // XXX MT --- > if (__exchange_and_add(&_M_references, -1) == 1) is correct (you had == 0). I.e. assuming it was correct as written (other than thread-safety) your rewrite had a memory leak. Other than that, Nathan and I have reviewed the patch; I have tested it and installed it (that report to the libstdc++ mailing list). To be closed after you confirm MP *-*-linux* system fixed as installed and then I move it to 3.0.X branch. Thanks, Loren
From: Andrew Pollard <andrewp@andypo.net> To: ljrittle@gcc.gnu.org, andrew@andypo.net, gcc-bugs@gcc.gnu.org, gcc-prs@gcc.gnu.org, rodrigc@gcc.gnu.org, gcc-gnats@gcc.gnu.org Cc: Subject: [andrewp@andypo.net: Re: libstdc++/5432: Implementation still not thread-safe on multiprocessor machines] Date: Wed, 30 Jan 2002 01:31:11 GMT ------- Start of forwarded message ------- Date: Sat, 26 Jan 2002 16:56:52 GMT From: Andrew Pollard <andrewp@andypo.net> To: ljrittle@gcc.gnu.org, andrew@andypo.net, gcc-bugs@gcc.gnu.org, gcc-prs@gcc.gnu.org, rodrigc@gcc.gnu.org, gcc-gnats@gcc.gnu.org In-reply-to: <20020124212600.20158.qmail@sources.redhat.com> (ljrittle@gcc.gnu.org) Subject: Re: libstdc++/5432: Implementation still not thread-safe on multiprocessor machines Sorry to repost... I hope first attempt didn't make it out of my machine.. Haven't seen ay responses.... Hope this can make it in for 3.0.4... Andrew. >Synopsis: Implementation still not thread-safe on multiprocessor machines > >State-Changed-From-To: analyzed->feedback >State-Changed-By: ljrittle >State-Changed-When: Thu Jan 24 13:25:59 2002 >State-Changed-Why: > Andrew, there were two minor issues with your patch: > (And it took collective thought to figure all this out > so don't feel bad. ;-) :-) Just glad to have helped. > In general, static _Atomic_word should always be init'd > to 0 with C++. We think g++ follows the init order > rules of C which are tighter than C++ but for C++ > non-zero static values may not be init'd until after main() > has run (the only rule says it must be done "before > the related block [scope] is entered" which might be after > threads were started). I didn't realise this. I had assumed that C++ followed the C rules for 'basic' types... > Secondly: > > < if (--_M_references == 0) // XXX MT > --- > > if (__exchange_and_add(&_M_references, -1) == 1) > > is correct (you had == 0). I.e. assuming it was correct > as written (other than thread-safety) your rewrite had a > memory leak. Interesting. This is actually a difference between the gcc-3.0 branch and the cvs head. I wrote the patch for gcc-3.0.x, and that branch has the test if (_M_references-- == 0) // XXX MT So I was right :-) [ But the branch is probably wrong... :-) ] Benjamin Kosnik changed the head cvs in late June last year.... > Other than that, Nathan and I have reviewed the patch; > I have tested it and installed it (that report to > the libstdc++ mailing list). To be closed after you > confirm MP *-*-linux* system fixed as installed and > then I move it to 3.0.X branch. Thanks, Loren Brilliant. I can confirm that the gcc cvs head as of today (saturday 26th) compiled and passed the test program on my dual processor i686 Redhat7.2 system. Andrew. - -- Andrew Pollard, ASI/Brooks Automation | home: andrew@andypo.net 670 Eskdale Road, Winnersh Triangle, UK | work: Andrew.Pollard@brooks.com Tel/Fax:+44 (0)118 9215603 / 9215660 | http://www.andypo.net ------- End of forwarded message -------
State-Changed-From-To: feedback->closed State-Changed-Why: Fixed in recent gcc 3.1 snapshots.
From: Andrew Pollard <andrewp@andypo.net> To: rodrigc@gcc.gnu.org, gcc-gnats@gcc.gnu.org Cc: andrew@andypo.net, gcc-bugs@gcc.gnu.org, gcc-prs@gcc.gnu.org, ljrittle@gcc.gnu.org Subject: Re: libstdc++/5432: Implementation still not thread-safe on multiprocessor machines Date: Mon, 18 Feb 2002 23:34:46 GMT >Synopsis: Implementation still not thread-safe on multiprocessor machines > >State-Changed-From-To: feedback->closed >State-Changed-By: rodrigc >State-Changed-When: Sun Feb 17 18:54:57 2002 >State-Changed-Why: > Fixed in recent gcc 3.1 snapshots. > >http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=5432 Cheers. I have verified that the 3.1 source "works as expected". Shame it couldn't make it into gcc-3.0.4, I'll just have to patch locally once gcc-3.0.4 is out... Andrew. -- Andrew Pollard, ASI/Brooks Automation | home: andrew@andypo.net 670 Eskdale Road, Winnersh Triangle, UK | work: Andrew.Pollard@brooks.com Tel/Fax:+44 (0)118 9215603 / 9215660 | http://www.andypo.net