This bug is not a clone of libstdc++/5037.

I tested gcc 3.0.3 with a small sample application in which memory allocation becomes corrupt. After some time the process grows until it has consumed all available memory. The application then runs out of memory and tries to write a core dump. Because there is too little free disk space, the core file is not written completely.

I also applied the atomicity fix (not yet released, but in the 3.1 stream), but it does not seem to solve the problem. When the application runs on a single-processor machine, there are no problems.

In my opinion there are two possible causes:

1. The atomicity code works fine, but because of the large number of strings the reference counter overflows. I could not see any protection against this in basic_string. However, I also could not find any hint in the debugger that an overflow occurs: a conditional breakpoint on the reference counter showed that it never reached a value of 1000 or more.

2. The atomicity code does not work correctly, or there are accesses which are not synchronized. Modifying accesses go through atomicity.h, but simple read accesses are done directly (see, for example, _M_is_shared() in bits/basic_string.h).

I already mailed Nathan Myers, who was involved in the basic_string implementation. I am not sure whether he found any error, but I am really sure that gcc 3.0.3 (with the patch you described) is not stable when running my application.

Currently I am using a mix of 2.95.3 and the 3.x atomicity code. This mix runs without any problem. However, I cannot apply my changes to the new implementation, because nearly everything in it is new.

I think the atomic implementation behaves as follows:

    /* !!! atomic uses more than one instruction to implement the mutex !!! */
    /* lock */
    do {
        current = atomics_value;
        // a context switch to a thread running on another processor
        // will force the other thread to loop until the current thread
        // unlocks the mutex => active waiting
    } while( current != 0 );

Release: shipped with gcc 3.0.3 (3.0.951ß9
Environment: SUN Solaris 2.8 multi-processor machine
How-To-Repeat: There seems to be a relationship between processor idle time and the core dumps. To reproduce the error, start two or three instances of the application. Open another shell running the top command (interval 1s) and watch the applications' memory usage. After about half a minute, the memory usage of one process will start to grow.
Fix: No real idea.
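The second hypothesis in the report (modifying accesses are atomic, but reads such as _M_is_shared() are plain loads) can be illustrated with a small sketch. This is not libstdc++ code: `SharedRep` and `run_refcount_demo` are hypothetical names, and the sketch assumes a C++11 compiler with `std::atomic` and `std::thread`, neither of which existed in gcc 3.0.3. It only shows a reference counter where both the modifications and the read go through the same atomic object.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for a shared string representation; not libstdc++ code.
class SharedRep {
public:
    SharedRep() : refs_(1) {}  // the owner holds the first reference

    // Modifying accesses: atomic read-modify-write operations.
    void acquire() { refs_.fetch_add(1, std::memory_order_relaxed); }

    // Returns true when the caller dropped the last reference.
    bool release() { return refs_.fetch_sub(1, std::memory_order_acq_rel) == 1; }

    // Read access: also goes through the atomic instead of a plain load.
    bool is_shared() const { return refs_.load(std::memory_order_acquire) > 1; }

    int count() const { return refs_.load(std::memory_order_acquire); }

private:
    std::atomic<int> refs_;
};

// Each thread takes and drops `iterations` references on one shared object.
// Returns the final reference count, which should be 1 (the owner's).
int run_refcount_demo(int thread_count, int iterations) {
    SharedRep rep;
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        threads.emplace_back([&rep, iterations] {
            for (int i = 0; i < iterations; ++i) {
                rep.acquire();
                (void)rep.is_shared();      // synchronized read
                bool last = rep.release();
                assert(!last);              // the owner still holds a reference
                (void)last;
            }
        });
    }
    for (auto& th : threads) th.join();
    return rep.count();
}
```

Calling `run_refcount_demo(4, 10000)` should leave the count at 1 on both single- and multi-processor machines. On most platforms the program has to be built with thread support (for example `g++ -std=c++11 -pthread`).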
Responsible-Changed-From-To: unassigned->ljrittle
Responsible-Changed-Why: Mine.
State-Changed-From-To: open->feedback
State-Changed-Why: I couldn't reproduce this problem on a sparc-sun-solaris2.7 dual-processor machine against gcc 3.0.4 prerelease. I saw no boundless memory growth or crashes with 1 or 2 copies of the test program running.
Responsible-Changed-From-To: ljrittle->unassigned
Responsible-Changed-Why: This doesn't look as closely related to the part of the code base I understand as I originally thought.
State-Changed-From-To: feedback->analyzed
State-Changed-Why: Agreed, I can reproduce the reduced test case failure on a two-way MP sparc-sun-solaris2.7 machine using gcc 3.0.3+"sparc atomicity patch". Parts of the reduced test case violate libstdc++-v3 threading rules, but I agree that ``std::ostringstream oss'' should be concurrently executable in different threads and that it is the root cause of a core dump.
From: Reichelt <reichelt@igpm.rwth-aachen.de>
To: gcc-gnats@gcc.gnu.org, ljrittle@gcc.gnu.org, markus.breuer@materna.de, gcc-bugs@gcc.gnu.org
Cc:
Subject: Re: libstdc++/5444: in multi-processor environment basic_string ist not thread safe
Date: Tue, 22 Jan 2002 17:49:10 +0100

Hi,

I can confirm the problems with the example in PR5444. I compiled the program with gcc 3.1-20020121 (configured with "--enable-threads") on a dual i686-pc-linux-gnu box. Running it, I get segfaults most of the time, but no memory growth. I only need to run one instance to get the segfault.

I tried to reduce the example a little and to get rid of the deprecated header <strstream>, and came up with the following example, which crashes within less than a second most of the time (just compile with "g++ filename.cpp -lpthread"):

    #include <pthread.h>
    #include <unistd.h>
    #include <iostream>
    #include <sstream>

    const int max_thread_count = 8;
    volatile int runningThreads = max_thread_count;
    pthread_t tid[ max_thread_count ];

    void* thread_main (void*)
    {
        std::cout << "Entering thread ..." << std::endl;
        for (int i=0; i<10000; ++i )
            std::ostringstream oss;
        std::cout << "Leaving thread ..." << std::endl;
        runningThreads--;
        return 0;
    }

    int main()
    {
        std::cout << "Startup ..." << std::endl;
        for ( int i=0; i < max_thread_count; ++i )
        {
            pthread_create( &tid[i], 0, thread_main, 0 );
            std::cout << "thread " << i+1 << " started ..." << std::endl;
        }
        while ( runningThreads )
            sleep (1);
        std::cout << "Shutdown ..." << std::endl;
        return 0;
    }

The problem seems to be hidden in the statement "std::ostringstream oss;". If I replace it with some different dummy code, everything works fine.

Greetings,
Volker Reichelt

http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=5444
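The reduced test case shares std::cout and a plain counter between threads, which makes it harder to attribute a crash to the stream construction alone. A minimal sketch of the same stress pattern without those shared objects might look like the following. It is not the test case that was added to the testsuite: `run_ostringstream_stress` is a hypothetical helper, and it assumes a C++11 compiler with `std::thread`, which was not available for the gcc versions discussed here.

```cpp
#include <cassert>
#include <sstream>
#include <thread>
#include <vector>

// Hypothetical helper: runs `thread_count` threads that each construct and
// destroy `iterations` ostringstream objects, then joins all of them.
// Returns the total number of streams constructed.
long run_ostringstream_stress(int thread_count, int iterations) {
    assert(thread_count > 0 && iterations >= 0);
    std::vector<long> constructed(thread_count, 0L);
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        threads.emplace_back([&constructed, t, iterations] {
            for (int i = 0; i < iterations; ++i) {
                std::ostringstream oss;  // the statement under suspicion
                ++constructed[t];        // each thread writes only its own slot
            }
        });
    }
    // Joining replaces the polling loop on a shared counter.
    for (auto& th : threads) th.join();
    long total = 0;
    for (long n : constructed) total += n;
    return total;
}
```

With the thread and iteration counts from the reduced example, `run_ostringstream_stress(8, 10000)` should return 80000 on a library where concurrent construction of `std::ostringstream` is safe. Build with thread support enabled, for example `g++ -std=c++11 -pthread`.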
State-Changed-From-To: analyzed->closed
State-Changed-Why: Testcase added. Same as libstdc++/5432. Patch on mainline.