There is a (multithreaded) condition where the STL behaves different if compiled with or without _GLIBCXX_DEBUG. If _GLIBCXX_DEBUG is defined the attached program crashes. To compile the attached program the ACE (http://www.cs.wustl.edu/~schmidt/ACE.html) library which can be downloaded at http://deuce.doc.wustl.edu/Download.html is needed. I suspect that the iterators make some modifications on the map they iterate on if _GLIBCXX_DEBUG is defined. This leads to different (and in this case erroneous) behavior in multithreaded applications. Lothar --------------------------------- cut --------------------------------------- /* # compile without _GLIBCXX_DEBUG lothar@janus-w$ g++ -V 3.4.1 -O0 -g -I ${ACE_ROOT}/ -L ${ACE_ROOT}/lib/ -lACE -lstdc++ -lpthread -lrt -ldl bug.cpp -o bug lothar@janus-w$ LD_LIBRARY_PATH=${ACE_ROOT}/lib ./bug lothar@janus-w$ # compile with _GLIBCXX_DEBUG but guard with a write lock lothar@janus-w$ g++ -V 3.4.1 -O0 -g -D_GLIBCXX_DEBUG -I ${ACE_ROOT}/ -L ${ACE_ROOT}/lib/ -lACE -lstdc++ -lpthread -lrt -ldl bug.cpp -o bug lothar@janus-w$ LD_LIBRARY_PATH=${ACE_ROOT}/lib ./bug lothar@janus-w$ # compile with _GLIBCXX_DEBUG (crashes) lothar@janus-w$ g++ -V 3.4.1 -O0 -g -D_GLIBCXX_DEBUG -DCRASH -I ${ACE_ROOT}/ -L ${ACE_ROOT}/lib/ -lACE -lstdc++ -lpthread -lrt -ldl bug.cpp -o bug bug.cpp:19:2: warning: #warning will crash lothar@janus-w$ LD_LIBRARY_PATH=${ACE_ROOT}/lib ./bug Segmentation fault lothar@janus-w$ */ #include <algorithm> #include <map> #include "ace/Task.h" #include "ace/Barrier.h" #define PROTECT 1 #ifdef _GLIBCXX_DEBUG #ifdef CRASH #undef PROTECT #define PROTECT 0 #warning will crash #endif /* CRASH */ #endif /* _GLIBCXX_DEBUG */ class MapTask : public ACE_Task_Base { public: MapTask( unsigned long cnt, unsigned long threadCnt ) : m_barrier(threadCnt+1), m_id(0), m_failed(0), m_cnt(cnt), m_threadCnt(threadCnt) { for(cnt = 0; cnt < m_cnt; ++cnt) { m_map[cnt] = 0; } this->activate( THR_NEW_LWP|THR_JOINABLE|THR_INHERIT_SCHED, m_threadCnt ); } ~MapTask() { } virtual int init (int argc, char *argv[]) { m_barrier.wait(); return 0; } virtual int fini (void) { this->wait(); return 0; } unsigned failed ( ) { return m_failed.value(); } virtual int svc (void) { long id = m_id++; m_barrier.wait(); try { for(int cnt = 0; cnt < m_cnt; ++cnt) { if(id % 2) { WriteGuard guard(m_mutex); m_map[cnt] += 1; } std::map<int,int>::const_iterator iter = m_map.end(); { ReadGuard guard(m_mutex); iter = m_map.find(cnt); } if(iter == m_map.end()) { ++m_failed; } else { unsigned long value = 0; do { value += iter->second; { // we have to guard the iterator movement // in the debug builds due to _GLIBCXX_DEBUG #if PROTECT WriteGuard guard(m_mutex); #endif /* PROTECT */ ++iter; } }while(iter != m_map.end()); } ACE_OS::sleep(ACE_Time_Value(0,10)); } } catch(...) { return -1; } return 0; } private: typedef ACE_Read_Guard < ACE_RW_Thread_Mutex > ReadGuard; typedef ACE_Write_Guard < ACE_RW_Thread_Mutex > WriteGuard; ACE_Thread_Barrier m_barrier; ACE_RW_Thread_Mutex m_mutex; ACE_Atomic_Op <ACE_Thread_Mutex, unsigned long> m_id; ACE_Atomic_Op <ACE_Thread_Mutex, unsigned long> m_failed; std::map<int,int> m_map; unsigned long m_cnt; unsigned long m_threadCnt; }; int main () { unsigned int numThreads = 200; unsigned int numIterations = 100; // create threads MapTask task(numIterations, numThreads); ACE_OS::sleep(1); // run threads task.init(0, 0); // wait for completion task.fini(); return 0; } /* end of main */ ------------------------------------- cut -----------------------------------
Invalid as std::map is not multithreaded that way (you would have the same problem without _GLIBCXX_DEBUG also as it is not done this way for either). What this is showing you is that you need to add the write guard other wise you get a crash or wrong answers without _GLIBCXX_DEBUG.
I did some reading in the source and the culprit is that _M_attach, _M_detach, ... are not thread safe. as iter++ creates a temporary iterator (that has to attach and detach) and they access the (shared) map, it failes in the multithreaded case. As these calls are ONLY executed when _GLIBCXX_DEBUG is defined, the program only fails there. I think this clearly IS a bug and should be solved by protecting the _M_attach, _M_detach, ... methods with a mutex.
oops I wanted reopen, not verify
Hi. I agree that something, somewhere, seems fishy in debug mode vs MT: we'll see... However, your testcase involves ++iter *not* iter++ ;)
Lothar, do you have specific source lines, or a patch you can propose? That would probably speed things up. Your test program is quite large. Is there anyway to reproduce this without all of ACE? If so, that also would be useful. -benjamin
No, unfortunaltely I had no time to do further work. It is also a lot of work (at least for me) to replace ACE and use pthreads natively, as ACE provides a lot of wonderful abstractions (that keep the code shorter than with native calls). However, if you are on Linux and don't want to compile ACE yourself, there's packages (deb, rpm) as well. Just google for them. Once you have ACE compiling and running the example is easy (it contains the compile and link command lines). Our "workaround" currently is simply not to use "_GLIBCXX_DEBUG". I think basically the problem is clear. As the debug versions of the containers/iterators access some shared data in the MT case (the iterator (de)registration is one such case) This data has to be protected by a mutex or something similar. However I had no time to create a patch for the code. As stated above we just switched to non debug STL for the time being. Lothar
Created attachment 7454 [details] proposed patch Here's how you'd fix it by adding mutexes. I don't know if all of these are necessary, because I haven't bothered to reproduce it. If you can test this patch and see if it fixes your fail I'd appreciate it. Is there a way to get this to fail without ACE? -benjamin
Mine.
Compiled like so, after setting LD_LIBRARY_PATH to ACE libs: g++ -g -O2 -pthread -I$bld/H-x86-ACE/include -L$bld/H-x86-ACE/lib -lACE 17664.cc As suspected, only the safe_iterator bits in the original patch needed to go in. However, this did fix this stuff up. I'm checking into head, will fix for 3.4.4. -benjamin
Fixed on the mainline, queued for 3.4.4.
Subject: Bug 17664 CVSROOT: /cvs/gcc Module name: gcc Changes by: bkoz@gcc.gnu.org 2004-11-08 20:47:25 Modified files: libstdc++-v3 : ChangeLog libstdc++-v3/src: debug.cc Log message: 2004-11-08 Benjamin Kosnik <bkoz@redhat.com> Doug Gregor <dgregor@cs.indiana.edu> PR libstdc++/17664 * src/debug.cc: Just use one mutex. Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/ChangeLog.diff?cvsroot=gcc&r1=1.2756&r2=1.2757 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/src/debug.cc.diff?cvsroot=gcc&r1=1.9&r2=1.10
Fixed mainline, 3.4.branch
Fixed
Subject: Bug 17664 CVSROOT: /cvs/gcc Module name: gcc Branch: gcc-3_4-branch Changes by: bkoz@gcc.gnu.org 2004-11-08 21:10:17 Modified files: libstdc++-v3 : ChangeLog libstdc++-v3/src: debug.cc Log message: 2004-11-08 Benjamin Kosnik <bkoz@redhat.com> Doug Gregor <dgregor@cs.indiana.edu> PR libstdc++/17664 * src/debug.cc : Just use one mutex. 2004-11-08 Benjamin Kosnik <bkoz@redhat.com> Lothar Werzinger <lothar@xcerla.com> PR libstdc++/17664 * src/debug.cc: Include concurrence, use mutexes. (_Safe_iterator_base::_M_attach): Here. (_Safe_iterator_base::_M_detach): Here. Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.2224.2.197&r2=1.2224.2.198 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/src/debug.cc.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.3.10.4&r2=1.3.10.5