Bug 5444 - in multi-processor environment basic_string is not thread safe
Summary: in multi-processor environment basic_string is not thread safe
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++
Version: 3.0.3
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2002-01-21 12:36 UTC by markus.breuer
Modified: 2004-03-31 23:17 UTC
CC List: 3 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments
stringtest.cpp (871 bytes, application/octet-stream)
2003-05-21 15:17 UTC, markus.breuer

Description markus.breuer 2002-01-21 12:36:00 UTC
This bug is not a clone of libstdc++/5037

I tested gcc 3.0.3 with a small sample application and found that memory allocation becomes corrupted. After some time the process grows until it has consumed all available memory. The application then runs out of memory and tries to write a core dump. Because there is too little free disk space, the core file is not written completely.
I also applied the atomicity fix (not yet released, but in the 3.1 stream), but it does not seem to solve the problem. When the application is started on a single-processor machine, there are no problems.

In my opinion there are two possible problems:

1. The atomicity code works fine, but because of the large number of strings the reference counter overflows. I could not see any protection against this in basic_string. However, I also could not find any hint (in the debugger) that an overflow occurs: a conditional breakpoint on the reference counter reaching 1000 or more was never hit.

2. The atomicity code does not work correctly, or there are accesses which are not synchronized. Modifying accesses go through atomicity.h, but simple read accesses are done directly (see, for example, _M_is_shared() in the header bits/basic_string.h).
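
The second hypothesis can be illustrated with a minimal sketch in which both the updates and the reads of a reference counter go through the same atomic object. This is not the libstdc++ 3.0 code: it assumes a C++11 toolchain with std::atomic, and the names SharedRep, grab, release, is_shared and run_refcount_demo are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for a shared string representation.
// It is NOT the libstdc++ _Rep class; names and layout are assumptions.
struct SharedRep {
    std::atomic<int> refcount{0};   // assumption: 0 means "exactly one owner"

    // Modifying accesses use atomic read-modify-write operations.
    void grab() { refcount.fetch_add(1, std::memory_order_relaxed); }

    // Returns true when the caller gave up the last reference.
    bool release() {
        return refcount.fetch_sub(1, std::memory_order_acq_rel) == 0;
    }

    // The read access also goes through the atomic, not a plain load.
    bool is_shared() const {
        return refcount.load(std::memory_order_acquire) > 0;
    }
};

const int kThreads = 4;
const int kIterations = 100000;

// Hammers one counter from several threads and returns its final value.
int run_refcount_demo() {
    SharedRep rep;
    std::vector<std::thread> pool;
    for (int t = 0; t < kThreads; ++t)
        pool.emplace_back([&rep] {
            for (int i = 0; i < kIterations; ++i) {
                rep.grab();
                (void)rep.is_shared();
                rep.release();
            }
        });
    for (auto& th : pool) th.join();
    // Back at the initial value only if no update was lost.
    return rep.refcount.load();
}
```

Calling run_refcount_demo() from a main function should return 0, because every grab is paired with a release. With a plain int counter in place of the atomic, the same loop has a data race and may lose updates on a multi-processor machine.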

I already mailed Nathan Myers, who was involved in the basic_string implementation. I am not sure whether he found any error, but I am quite sure that gcc 3.0.3 (with the patch you described) is not stable when running my application.

Currently I am using a mix of 2.95.3 and the 3.x atomicity code. This mix runs without any problems. But I cannot carry my changes over to the new implementation, because nearly everything in it is new.


I think the atomic implementation behaves as follows:

/* !!! atomic uses more than one instruction to implement the mutex !!! */
/* lock */
do {
   current = atomics_value; // a context switch to a thread running on another processor
                            // will force that thread to loop until the current thread
                            // unlocks the mutex => active waiting
} while( current != 0 );
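
For comparison, here is a minimal sketch of a spin lock in which the test and the set happen in one indivisible step, so the multi-instruction window described above cannot occur. It assumes a C++11 toolchain with std::atomic_flag; the class SpinLock and the function count_with_spinlock are hypothetical and do not reflect how atomicity.h in gcc 3.0.x is written.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spin lock built on std::atomic_flag (C++11).
class SpinLock {
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
public:
    void lock() {
        // test_and_set reads the old value and writes "locked" as a single
        // atomic operation, so two threads can never both see "unlocked".
        while (flag_.test_and_set(std::memory_order_acquire))
            std::this_thread::yield();   // still active waiting, but polite
    }
    void unlock() { flag_.clear(std::memory_order_release); }
};

// Increments a plain counter from several threads under the spin lock
// and returns the final value.
long count_with_spinlock(int threads, int iterations) {
    SpinLock lock;
    long counter = 0;   // ordinary variable, protected only by the lock
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&lock, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}
```

If the lock is correct, count_with_spinlock(4, 10000) returns 40000. A lock that reads the flag and writes it in two separate steps could let two threads into the critical section at once and return a smaller value.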

Release:
shipped with gcc 3.0.3 (3.0.951ß9

Environment:
SUN Solaris 2.8
multi-processor machine

How-To-Repeat:
It seems there is a relationship between processor idle time and the core dumps. To reproduce the error, start two or three instances of the application. Open another shell running the top command (interval 1s) and watch the application's memory usage. After about half a minute, the memory usage of one process will start to grow.
Comment 1 markus.breuer 2002-01-21 12:36:00 UTC
Fix:
No real idea.
Comment 2 Loren Rittle 2002-01-21 13:51:04 UTC
Responsible-Changed-From-To: unassigned->ljrittle
Responsible-Changed-Why: Mine.
Comment 3 Loren Rittle 2002-01-21 13:51:04 UTC
State-Changed-From-To: open->feedback
State-Changed-Why: I couldn't reproduce this problem on a sparc-sun-solaris2.7
    dual-processor machine against gcc 3.0.4 prerelease.
    
    I saw no boundless memory growth or crashes with
    1 or 2 copies of the test program running.
Comment 4 Loren Rittle 2002-01-22 14:13:19 UTC
Responsible-Changed-From-To: ljrittle->unassigned
Responsible-Changed-Why: Doesn't look as related to the part of the code base
    I understand as I originally thought.
Comment 5 Loren Rittle 2002-01-22 14:13:19 UTC
State-Changed-From-To: feedback->analyzed
State-Changed-Why: Agreed, I can reproduce the reduced test case failure
    on a two-way MP sparc-sun-solaris2.7 machine using
    gcc 3.0.3+"sparc atomicity patch".  Parts of 
    the reduced test case violate libstdc++-v3 threading
    rules but I agree that ``std::ostringstream oss''
    should be concurrently executable in different threads
    and is the root cause of a core dump.
Comment 6 Volker Reichelt 2002-01-22 17:49:10 UTC
From: Reichelt <reichelt@igpm.rwth-aachen.de>
To: gcc-gnats@gcc.gnu.org, ljrittle@gcc.gnu.org, markus.breuer@materna.de,
        gcc-bugs@gcc.gnu.org
Cc:  
Subject: Re: libstdc++/5444: in multi-processor environment basic_string ist not thread safe
Date: Tue, 22 Jan 2002 17:49:10 +0100

 Hi,
 
 I can confirm the problems with the example in PR5444.
 I compiled the program with gcc 3.1-20020121 (configured with
 "--enable-threads") on a dual i686-pc-linux-gnu box.
 Running it I get segfaults most of the time, but no memory growth.
 I only need to run one instance to get the segfault.
 
 I tried to reduce the example a little bit and to get rid of the
 deprecated header <strstream>, and came up with the following example
 that crashes within less than a second most of the time
 (just compile with "g++ filename.cpp -lpthread"):
 
 #include <pthread.h>
 #include <unistd.h>
 #include <iostream>
 #include <sstream>
 
 const int max_thread_count = 8;
 volatile int runningThreads = max_thread_count;
 
 pthread_t tid[ max_thread_count ];
 
 void* thread_main (void*)
 {
    std::cout << "Entering thread ..." << std::endl;
 
    for (int i=0; i<10000; ++i )
       std::ostringstream oss;
 
    std::cout << "Leaving thread ..." << std::endl;
    runningThreads--;
    return 0;   // a pthread start routine must return a void*
 }
 
 
 int main()
 {
    std::cout << "Startup ..." << std::endl;
 
    for ( int i=0; i < max_thread_count; ++i )
    {
       pthread_create( &tid[i], 0, thread_main, 0 );
       std::cout << "thread " << i+1 << " started ..." << std::endl;
    }
 
    while ( runningThreads )
       sleep (1);
 
    std::cout << "Shutdown ..." << std::endl;
 
    return 0;
 }
 
 The problem seems to be hidden in the command "std::ostringstream oss;".
 If I replace it with some different dummy code, everything works fine.
 
 Greetings,
 Volker Reichelt
 
 http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=5444
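
The reduced test case in Comment 6 can also be written so that the only thing the threads do concurrently is construct, use and destroy their own std::ostringstream objects. The sketch below is a variant under stated assumptions, not the test case that was added to the testsuite: it assumes a C++11 toolchain (std::thread did not exist for the compilers discussed here), and the names construct_streams and run_stream_test are hypothetical. It joins the threads instead of polling a volatile counter and writes nothing to std::cout from the worker threads.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Each thread constructs and destroys its own ostringstream objects;
// no stream object is shared between threads.
void construct_streams(int iterations, std::size_t* chars_written) {
    std::size_t total = 0;
    for (int i = 0; i < iterations; ++i) {
        std::ostringstream oss;
        oss << i;                       // format the loop index
        total += oss.str().size();      // count the characters produced
    }
    *chars_written = total;             // each thread writes its own slot
}

// Runs the loop in thread_count threads and returns the total number of
// characters formatted across all threads.
std::size_t run_stream_test(int thread_count, int iterations) {
    std::vector<std::size_t> results(thread_count, 0);
    std::vector<std::thread> pool;
    for (int t = 0; t < thread_count; ++t)
        pool.emplace_back(construct_streams, iterations, &results[t]);
    for (auto& th : pool) th.join();    // replaces polling a shared counter
    std::size_t sum = 0;
    for (std::size_t r : results) sum += r;
    return sum;
}
```

With 8 threads and 10 iterations each, every thread formats the digits 0 through 9, so run_stream_test(8, 10) returns 80 on a library where concurrent stream construction is safe.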
 
 
Comment 7 Loren Rittle 2002-01-24 13:48:44 UTC
State-Changed-From-To: analyzed->closed
State-Changed-Why: Testcase added.  Same as libstdc++/5432.  Patch on
    mainline.