This is GCC Bugzilla
This is GCC Bugzilla Version 2.20+
View Bug Activity | Format For Printing | Clone This Bug
Hello, I noticed the following problem while porting an internal C++ application from linux to Cygwin. If a std::string instance created in one module (exe or dll) is passed to another say as an argument of a function call, the program crashes or hangs. I did debug for a while and it turned out that it is not the program itself that causes the crash - the cause lies in the std::string implementation, the fact that libstdc++ is provided as a static archive and that it is compiled without _GLIBCXX_FULLY_DYNAMIC_STRING defined. What happens is that each module which links agains libstdc++ get its very personal copy of the class member _S_empty_rep_storage. Now since _GLIBCXX_FULLY_DYNAMIC_STRING is not defined an empty string instance i.e. one that is created by the default constructor of std::string gets a pointer to _S_empty_rep_storage i.e. the actual allocation of memory is delayed until memory is really needed. If _GLIBCXX_FULLY_DYNAMIC_STRING is defined each string instance would get a pointer to a newly heap allocated block of memory instead . Now look at the ouput of gdb and nm used on the attached testcase: Administrator@mordor ~/src/testcase/tc $ nm dll.dll | grep _S_empty_rep_storage 1000c030 d .data$_ZNSs4_Rep20_S_empty_rep_storageE 1000c030 D __ZNSs4_Rep20_S_empty_rep_storageE Administrator@mordor ~/src/testcase/tc $ nm main.exe | grep _S_empty_rep_storage 00442120 d .data$_ZNSs4_Rep20_S_empty_rep_storageE 00442120 D __ZNSs4_Rep20_S_empty_rep_storageE === Breakpoint 1, main (argc=1, argv=0x10042980) at main.cc:9 9 { (gdb) n 10 string s, s1; (gdb) 12 export1 (s); (gdb) print s $1 = {static npos = 4294967295, _M_dataplus = {<allocator<char>> = {<new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x44212c ""}} (gdb) print s1 $2 = {static npos = 4294967295, _M_dataplus = {<allocator<char>> = {<new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x44212c ""}} === Here the two strings share _S_empty_rep_storage. === (gdb) step export1 (s=@0x22eea0) at dll.cc:7 7 string s1; (gdb) n 9 s1.push_back ('A'); (gdb) n 10 s.push_back ('A'); (gdb) print s $3 = (string &) @0x22eea0: {static npos = 4294967295, _M_dataplus = {<allocator<char>> = {<new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x44212c ""}} (gdb) print s1 $4 = {static npos = 4294967295, _M_dataplus = {<allocator<char>> = {<new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x10042aec "A"}} (gdb) === Here s1 points to the dll local _S_empty_rep_storage. The _M_p member of _M_dataplus is pointing to different copies of _S_empty_rep_storage - one stored in the executable which calls the dll and another one in the dll itself. Now the second push_back() call in export1 () will end up calling _M_mutate() to actually allocate storage. _M_mutate() will call _M_rep()->_M_dispose() which will end up free()-ing the memory reserved for _S_empty_rep_storage in the main exe. There is a check to prevent free()-ing _S_empty_rep_storage: #ifndef _GLIBCXX_FULLY_DYNAMIC_STRING if (__builtin_expect(this != &_S_empty_rep(), false)) #endif .... but it doesn't work well in the case when a string instance created in one module is passed to another and libstdc++ is statically linked because of the fact that each module has its own copy _S_empty_rep_storage. Can we get this fixed ?
Created an attachment (id=9877) [edit] Testcase which demostrates the problem
Maybe it is better just to support shared (dynamic) Libraries on win32 too.
Is there any specific reason that _GLIBCXX_FULLY_DYNAMIC_STRING is not defined on Cygwin ?
(In reply to comment #3) > Is there any specific reason that _GLIBCXX_FULLY_DYNAMIC_STRING is not defined > on Cygwin ? Not having really looked into this issue, the reason is however obvious: the use of _S_empty_rep_storage represents a very good optimization without negative side effects for many (I would say, most) applications of the c++ library. As the string class is currently designed, _GLIBCXX_FULLY_DYNAMIC_STRING should be definitely undefined by default. By the way, for gcc4.1 we will provide an alternate "versatile" string class which doesn't have this problem, but in any case cannot become immediately our default string class implementation due to ABI stability (you can already play with it using CVS gcc: <ext/vstring.h>)
Paolo, I guess it is obvious to anyone around that _S_empty_rep_storage is there to speed up things. What is not obvious to Cygwin users is that this optimization actually makes their programs crash. And let me assure you that the crash is not easy to track down unless you want to step trough assembler code. Anyway, what I meant was that it may be good to have _GLIBCXX_FULLY_DYNAMIC_STRING undefined for the common case but not for the current libstdc++ on Cygwin i.e. libstdc++ comes as a static library. Maybe this question concerns only Cygwin and is not of general interest and as such should be discussed only on the Cygwin mailing list.
(In reply to comment #5) > Paolo, I guess it is obvious to anyone around that _S_empty_rep_storage is > there to speed up things. What is not obvious to Cygwin users is that this > optimization actually makes their programs crash. And let me assure you that > the crash is not easy to track down unless you want to step trough assembler > code. I believe you, but I'm definitely not sure that giving away by default this important optimization for the sake of the few people playing with dlls is the way to go for Cygwin. Maybe the fact that in so many years nobody reported problems before should tell you something... Anyway, you are probably right, you should first discuss the issue in the Cygwin community and weight the various options. Then, if a change to the configury localized to Cygwin will be in order rest assured that will be very quickly approved. Thanks.
PS: if you are going to experiment with _GLIBCXX_FULLY_DYNAMIC_STRING defined by default, I would suggest also testing the performance impact of the following: change the default constructor to allocate memory on the heap not only for the '\0', but also for a given small capacity > 0, say 15 chars. This would recover in some cases the initial overhead by not freeing and allocating memory at the first real usage.
I'm thinking that another possibility would be avoiding only *part* of the optimization scheme that relies on _S_empty_rep_storage, that is return to the behavior pre-2003-06-13. The empty string object would be reference counted, as any other string, no special cases and no addresses at destruction time. Seems a good compromise for some targets. Are you willing to test a patch? I can prepare one vs 3.4.x but consider that that branch is no longer maintained and any change would eventually go in mainline (and, possibly, 4_0-branch) only.
I will test any patch that resolves the issue. In any case the final decision will be that of the Cygwin gcc maintainer. But what you suggests sounds better than just using --enable-fully-dynamic-string.
(In reply to comment #9) > I will test any patch that resolves the issue. In any case the final decision > will be that of the Cygwin gcc maintainer. But what you suggests sounds better > than just using --enable-fully-dynamic-string. Excellent. I will prepare one against 3_4-branch, for your ease. Will not include any configury bits, for now, just apply and rebuild. Give me 1-2 days max...
Created an attachment (id=9911) [edit] First draft vs 3_4-branch
I'm attaching a first draft implementing what I have in mind. Assuming testing goes well, the issue of binary compatibility should be evaluated: whereas the library-ABI is preserved, in general object code build with the current lib and with the new one will not be able to safely exchange empty strings. I don't how much this is an issue for the Cygwin project (certainly it would be for a linux distribution). Consider, anyway, that we are talking about a new configure option and those can break the binary compatibility in any way (like changing the locale model, for instance), we don't provide guarantees in those cases. We can try to figure out ways to improve that, but I'm not super confident we can make it. If binary compatibility is a major requirement, then the --enable-fully-dynamic-string option would be ok, maybe improved along the way I mentioned yesterday.
Both the simple testcase and the program I am working on work fine with your patch.
(In reply to comment #13) > Both the simple testcase and the program I am working on work fine with your > patch. Thanks. Actually, I have to think a bit more about the idea. I'm not sure that there are binary compatibility problems, maybe not ;)
I'm new maintainer for Cygwin gcc. I'll be rolling a release with the patch Paolo proposed rather than using the configure option, although if binary compatibility problems do crop up I'll look at the second route. DaveK
Ok. Hopefully, before the end of this week I can tell you something trustworthy about binary compatibility.
On mingw32 the testcase will succeed on trunk if libstdc++ (and libgcc) are built as dlls. Wouldn't that be the preferred solution? It also solves very similar problems with EH data. Danny
(In reply to comment #16) > Ok. Hopefully, before the end of this week I can tell you something trustworthy > about binary compatibility. > Have you found a solution for it? On w64 target 4.4 (and 4.5) the problem seems to exist still. At least the patch you sent solved it (I had to adjust it a bit). Cheers, Kai
Sorry, but at this point in history, it's unlikely that we are going to do much to the current std::string, given all the ABI implications. When we'll break the ABI, for C++1x, a completely different implementation will be used, neither affected by this issue nor by many others (ext/vstring is close, and already available).
I think the best solution to this problem is to enable libstdc++ as a DLL, and export _S_empty_rep from it. Then every C++ DLL and EXE will link against libstdc++ and they'll all import the exact same instance. Problem solved *and* we get to keep the optimisation. The more difficult problem of windows shared-libraries not allowing undefined references (and how this interacts with RTTI and COMDATs in particular) is a larger issue which will need to be tackled separately. I'm about to test a patch based on this approach.
I'm actually getting this same error on Snow Leopard (Mac OS X 10.6.0). It's pretty easy to reproduce with Boost (1.40): #include <string> #include <sstream> #include <boost/variant.hpp> int main() { std::ostringstream buf; boost::variant<bool, std::string> data; data = buf.str(); data = false; return 0; } $ g++ -I/opt/local/include -o bug bug.cc What happens here is that ostringstream, which is in libstdc++.dylib, returns an empty string which does not match the notion of empty string compiled into my executable; thus when Boost tries to deconstruct the string it stored in order to store the boolean, it crashes trying to deallocate a foreign lib's empty string. This doesn't happen with the same Boost using the stock compiler (g++ 4.2). John
A little more data: With the stock compiler, g++ 4.2.1: vulcan /tmp $ otool -L bug bug: /usr/lib/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.9.0) /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 123.0.0) vulcan /tmp $ nm -o /usr/lib/libstdc++.6.dylib | grep empty_rep 0003f200 T __ZNSbIwSt11char_traitsIwESaIwEE12_S_empty_repEv 0003f2fd T __ZNSbIwSt11char_traitsIwESaIwEE4_Rep12_S_empty_repEv 0006fe50 S __ZNSbIwSt11char_traitsIwESaIwEE4_Rep20_S_empty_rep_storageE 0002fae3 T __ZNSs12_S_empty_repEv 0002fbe7 T __ZNSs4_Rep12_S_empty_repEv 0006fdb0 S __ZNSs4_Rep20_S_empty_rep_storageE vulcan /tmp $ nm bug | grep empty_rep 0000000100001fc7 T __ZNSs12_S_empty_repEv 0000000100001f14 T __ZNSs4_Rep12_S_empty_repEv 000000010000dd00 D __ZNSs4_Rep20_S_empty_rep_storageE With the compiler built from MacPorts, g++ 4.4.2: vulcan /tmp $ otool -L bug bug: /opt/local/lib/gcc44/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.13.0) /opt/local/lib/gcc44/libgcc_s.1.dylib (compatibility version 1.0.0, current version 1.0.0) /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 123.0.0) vulcan /tmp $ nm /opt/local/lib/gcc44/libstdc++.6.dylib | grep empty_rep 00000000000655c0 T __ZNSbIwSt11char_traitsIwESaIwEE12_S_empty_repEv 00000000000658c0 T __ZNSbIwSt11char_traitsIwESaIwEE4_Rep12_S_empty_repEv 00000000000bbe40 D __ZNSbIwSt11char_traitsIwESaIwEE4_Rep20_S_empty_rep_storageE 000000000004de20 T __ZNSs12_S_empty_repEv 000000000004e0e0 T __ZNSs4_Rep12_S_empty_repEv 00000000000bbdc0 D __ZNSs4_Rep20_S_empty_rep_storageE vulcan /tmp $ nm bug | grep empty_rep 000000010000193b T __ZNSs4_Rep12_S_empty_repEv 0000000100007820 D __ZNSs4_Rep20_S_empty_rep_storageE
I should also mention, this discrepancy only occurs when _GLIBCXX_DEBUG=1.
The Snow Leopard case is different, apparently they decided to build the system compiler with --enable-fully-dynamic-string and then that shows up everywhere, see 41645. As far as we can understand so far, people willing to experiment with recent GCCs on Snow Leopard have to use that configure time switch.