Bug 42624 - libstdc++ parallel mode deadlocks in barrier
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++
Version: 4.4.2
Importance: P3 normal
Target Milestone: ---
Assignee: Johannes Singler
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-01-05 17:59 UTC by Török Edwin
Modified: 2010-01-27 12:35 UTC

See Also:
Host: x86_64-linux-gnu
Target: x86_64-linux-gnu
Build: x86_64-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2010-01-12 14:35:01


Attachments
Removes superfluous pragma omp single twice (264 bytes, patch)
2010-01-15 14:30 UTC, Johannes Singler
Add printf debug statements. (521 bytes, patch)
2010-01-15 14:30 UTC, Johannes Singler

Description Török Edwin 2010-01-05 17:59:36 UTC
When ClamAV is built with -D_GLIBCXX_PARALLEL -fopenmp, clamd hangs. Attaching gdb to the hung process reveals 2 threads: one is waiting in poll (normal),
the other is stuck in gomp_team_barrier_wait_end.
Since there are no other threads that could release the barrier, that thread will wait there indefinitely:

  2 Thread 0x7fedad8d1910 (LWP 6147)  0x00000031798c0783 in *__GI___poll (fds=<value optimized out>, nfds=<value optimized out>,
    timeout=<value optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87
* 1 Thread 0x7fedbdc84770 (LWP 6146)  futex_wait (bar=<value optimized out>, state=<value optimized out>)
    at ../../../src/libgomp/config/linux/x86/futex.h:44

Steps to reproduce:
1. Download ClamAV snapshot:
http://git.clamav.net/gitweb?p=clamav-devel.git;a=snapshot;h=fc382bd68b9e2e14198ca05efc72fba15f1a32da;sf=tgz
2. Unpack snapshot
3. Build it with libstdc++ parallel mode:
$ ./configure CXXFLAGS="-fopenmp -D_GLIBCXX_PARALLEL" LDFLAGS=-fopenmp --disable-clamav --enable-llvm=yes
$ make -j4
4. Edit etc/clamd.conf, remove "Example" line
5. Start clamd:
$ clamd/clamd -c etc/clamd.conf
6. Reload database twice:
$ clamdscan/clamdscan --reload -c etc/clamd.conf
$ clamdscan/clamdscan --reload -c etc/clamd.conf
^C(deadlocks here)
7. $ ps -ef | grep clamd
edwin     7226     1  0 19:55 ?        00:00:00 /home/edwin/clam/git/publicgitgit/clamd/.libs/lt-clamd -c etc/clamd.conf
8. Attach gdb
$ gdb 
(gdb) attach 7226
(gdb) thread apply all bt
Thread 2 (Thread 0x7f5180e31910 (LWP 7294)):
#0  0x00000031798c0783 in *__GI___poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x000000000040ee6d in fds_poll_recv (data=0x7fff884c1e40, timeout=<value optimized out>, check_signals=0) at others.c:487
#2  0x000000000040d31f in acceptloop_th (arg=<value optimized out>) at server-th.c:320
#3  0x000000317a40673a in start_thread (arg=<value optimized out>) at pthread_create.c:300
#4  0x00000031798cb6fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#5  0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f51911e4770 (LWP 7293)):
#0  futex_wait (bar=<value optimized out>, state=<value optimized out>) at ../../../src/libgomp/config/linux/x86/futex.h:44
#1  do_wait (bar=<value optimized out>, state=<value optimized out>) at ../../../src/libgomp/config/linux/wait.h:58
#2  gomp_team_barrier_wait_end (bar=<value optimized out>, state=<value optimized out>)
    at ../../../src/libgomp/config/linux/bar.c:109
#3  0x00007f51913d43d8 in __gnu_parallel::find_template<__gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, __gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, std::binder2nd<__gnu_parallel::equal_to<llvm::PassInfo const*, llvm::PassInfo const* const&> >, __gnu_parallel::find_if_selector> ()
   from /home/edwin/clam/git/publicgitgit/libclamav/.libs/libclamav.so.6
#4  0x00007f51913defea in std::pair<__gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, __gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > > > __gnu_parallel::find_template<__gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, __gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, std::binder2nd<__gnu_parallel::equal_to<llvm::PassInfo const*, llvm::PassInfo const* const&> >, __gnu_parallel::find_if_selector>(__gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, __gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, __gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, std::binder2nd<__gnu_parallel::equal_to<llvm::PassInfo const*, llvm::PassInfo const* const&> >, __gnu_parallel::find_if_selector, __gnu_parallel::constant_size_blocks_tag) ()
   from /home/edwin/clam/git/publicgitgit/libclamav/.libs/libclamav.so.6
#5  0x00007f51913dc9b1 in std::pair<__gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, __gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > > > __gnu_parallel::find_template<__gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, __gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, std::binder2nd<__gnu_parallel::equal_to<llvm::PassInfo const*, llvm::PassInfo const* const&> >, __gnu_parallel::find_if_selector>(__gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, __gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, __gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, std::binder2nd<__gnu_parallel::equal_to<llvm::PassInfo const*, llvm::PassInfo const* const&> >, __gnu_parallel::find_if_selector) ()
   from /home/edwin/clam/git/publicgitgit/libclamav/.libs/libclamav.so.6
#6  0x00007f51913daaff in __gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > > std::__parallel::find_switch<__gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, llvm::PassInfo const*>(__gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, __gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, llvm::PassInfo const* const&, std::random_access_iterator_tag) ()
   from /home/edwin/clam/git/publicgitgit/libclamav/.libs/libclamav.so.6
#7  0x00007f51913d7f6c in __gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > > std::__parallel::find<__gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, llvm::PassInfo const*>(__gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, __gnu_cxx::__normal_iterator<llvm::PassInfo const* const*, std::__cxx1998::vector<llvm::PassInfo const*, std::allocator<llvm::PassInfo const*> > >, llvm::PassInfo const* const&) () from /home/edwin/clam/git/publicgitgit/libclamav/.libs/libclamav.so.6
#8  0x00007f51913ce5fa in llvm::PMTopLevelManager::findAnalysisPass(llvm::PassInfo const*) ()
   from /home/edwin/clam/git/publicgitgit/libclamav/.libs/libclamav.so.6
#9  0x00007f51913ce2fb in llvm::PMTopLevelManager::schedulePass(llvm::Pass*) ()
   from /home/edwin/clam/git/publicgitgit/libclamav/.libs/libclamav.so.6
#10 0x00007f51913d5692 in llvm::FunctionPassManagerImpl::add(llvm::Pass*) ()
   from /home/edwin/clam/git/publicgitgit/libclamav/.libs/libclamav.so.6
#11 0x00007f51913d18df in llvm::FunctionPassManager::add(llvm::Pass*) ()
   from /home/edwin/clam/git/publicgitgit/libclamav/.libs/libclamav.so.6
#12 0x00007f5191906e47 in llvm::LLVMTargetMachine::addCommonCodeGenPasses(llvm::PassManagerBase&, llvm::CodeGenOpt::Level) ()
   from /home/edwin/clam/git/publicgitgit/libclamav/.libs/libclamav.so.6
#13 0x00007f5191906c01 in llvm::LLVMTargetMachine::addPassesToEmitMachineCode(llvm::PassManagerBase&, llvm::JITCodeEmitter&, llvm::CodeGenOpt::Level) () from /home/edwin/clam/git/publicgitgit/libclamav/.libs/libclamav.so.6
#14 0x00007f519139165e in llvm::JIT::JIT(llvm::ModuleProvider*, llvm::TargetMachine&, llvm::TargetJITInfo&, llvm::JITMemoryManager*, llvm::CodeGenOpt::Level, bool) () from /home/edwin/clam/git/publicgitgit/libclamav/.libs/libclamav.so.6
#15 0x00007f5191391209 in llvm::JIT::createJIT(llvm::ModuleProvider*, std::string*, llvm::JITMemoryManager*, llvm::CodeGenOpt::Level, bool, llvm::CodeModel::Model) () from /home/edwin/clam/git/publicgitgit/libclamav/.libs/libclamav.so.6
#16 0x00007f5191391131 in llvm::ExecutionEngine::createJIT(llvm::ModuleProvider*, std::string*, llvm::JITMemoryManager*, llvm::CodeGenOpt::Level, bool, llvm::CodeModel::Model) () from /home/edwin/clam/git/publicgitgit/libclamav/.libs/libclamav.so.6
#17 0x00007f51913948ec in llvm::JIT::create(llvm::ModuleProvider*, std::string*, llvm::JITMemoryManager*, llvm::CodeGenOpt::Level, bool, llvm::CodeModel::Model) () from /home/edwin/clam/git/publicgitgit/libclamav/.libs/libclamav.so.6
#18 0x00007f5191544b3f in llvm::EngineBuilder::create() () from /home/edwin/clam/git/publicgitgit/libclamav/.libs/libclamav.so.6
#19 0x00007f519132d365 in cli_bytecode_prepare_jit () from /home/edwin/clam/git/publicgitgit/libclamav/.libs/libclamav.so.6
#20 0x00007f5191310de8 in cli_bytecode_prepare (bcs=0x1703104) at bytecode.c:1580
#21 0x00007f51912a380d in cl_engine_compile (engine=0x17009f0) at readdb.c:2614
#22 0x000000000040b0ea in reload_db (engine=0x17009f0, dboptions=8234, opts=<value optimized out>,
    do_check=<value optimized out>, ret=0x7fff884c216c) at server-th.c:231
#23 0x000000000040c426 in recvloop_th (socketds=0x0, nsockets=<value optimized out>, engine=<value optimized out>,
    dboptions=<value optimized out>, opts=<value optimized out>) at server-th.c:1265
#24 0x0000000000407b4b in main (argc=<value optimized out>, argv=<value optimized out>) at clamd.c:486
(gdb)

9. Note that the backtrace shows only 2 threads: one is waiting in poll(), the other in gomp_team_barrier_wait_end.
10. Deadlock

System info:
$ g++ -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.2-8' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --program-suffix=-4.4 --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --with-arch-32=i486 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.4.2 (Debian 4.4.2-8)

Sorry that I didn't reduce this to a testcase; a trivial testcase with std::find and pthread doesn't hang.

There is also a problem with parallel mode when building LLVM: LLVM's tblgen makes very slow progress when multiple tblgen processes run in parallel on a multicore machine (see http://llvm.org/bugs/show_bug.cgi?id=5804).
Comment 1 Török Edwin 2010-01-05 18:09:54 UTC
(In reply to comment #0)
> $ make -j4

This should have been: make CCLD=g++ -j4
Comment 2 Paolo Carlini 2010-01-05 19:22:26 UTC
The best we can do is ask for Johannes's attention...
Comment 3 Paolo Carlini 2010-01-12 02:19:05 UTC
Johannes is looking into it. Reproducing the problem will certainly not be a trivial task, I'm afraid...
Comment 4 Török Edwin 2010-01-12 07:28:55 UTC
(In reply to comment #3)
> Johannes is looking into it,

Thanks.
> certainly reproducing the problem will not be a
> trivial task, I'm afraid...
> 

If the steps I listed in the bug report don't work for you, just let me know which step you need more info for.
You can also ping me on #gcc (oftc.net), or #clamav (freenode.net).
Comment 5 Paolo Carlini 2010-01-12 11:54:02 UTC
Thanks. If you could do your best to reduce this to something small and self-contained, that would be great; otherwise we have nothing to add to the testsuite anyway.
Comment 6 Johannes Singler 2010-01-12 12:36:30 UTC
Can I get this thing to run without actually installing it into the system?

5. clamd/clamd -c etc/clamd.conf
LibClamAV Error: cl_load(): Can't get status of /usr/local/share/clamav
ERROR: Can't get file status

Please enter the GCC version into the "Reported against" field.
What happens for OMP_NUM_THREADS=1?

I will look thoroughly into the find implementation in the meantime.
Comment 7 Török Edwin 2010-01-12 12:41:20 UTC
(In reply to comment #6)
> Can I get this thing to run without actually installing it into the system?
> 
> 5. clamd/clamd -c etc/clamd.conf
> LibClamAV Error: cl_load(): Can't get status of /usr/local/share/clamav
> ERROR: Can't get file status

Yes, you can specify the path.
A minimal example (you can use any path instead of /tmp):
$ mkdir /tmp/testdb
$ touch /tmp/testdb/foo.pdb
$ cat >etc/clamd.conf <<EOF
DatabaseDirectory /tmp/testdb
LocalSocket /tmp/clamd.socket
EOF
$ clamd/clamd -c etc/clamd.conf

Same for clamdscan (-c etc/clamd.conf)

> 
> Please enter the GCC version into the "Reported against" field.

Done.

> What happens for OMP_NUM_THREADS=1?

Will test now.

> 
> I will look thoroughly into the find implementation in the meantime.
> 

Ok.
Comment 8 Török Edwin 2010-01-12 12:51:38 UTC
(In reply to comment #7)
> > What happens for OMP_NUM_THREADS=1?
> 
> Will test now.

It doesn't hang with OMP_NUM_THREADS=1. It does hang with OMP_NUM_THREADS=2,
or with OMP_NUM_THREADS unset.

> > Please enter the GCC version into the "Reported against" field.
> 

I reproduced the issue with gcc version 4.3.2 (Debian 4.3.2-1.1) too.

BTW you can also find my build on gcc14 in the compiler farm at /home/edwin/clam/git_test/clamav-devel (should be world readable).
Comment 9 Török Edwin 2010-01-12 13:35:18 UTC
Could this bug be related to this one: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36242#c4

Clamd creates threads using pthread_create, and std::find is called from those threads. There are also threads that only poll/dispatch and never use the STL (hence never use OpenMP). However, the gcc manual doesn't mention any incompatibility between pthread_create and OpenMP (or libstdc++ parallel mode).
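
For illustration only, the thread structure described in this comment can be sketched as follows. This is not ClamAV code: FindJob, find_worker and find_in_thread are hypothetical names, and the sketch is in the spirit of the "trivial testcase" from the description, which did not hang. A build with -D_GLIBCXX_PARALLEL -fopenmp may route the std::find call to the OpenMP-based implementation.

```cpp
#include <pthread.h>
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical job description handed to the worker thread.
struct FindJob {
    const std::vector<int>* data;
    int needle;
    std::ptrdiff_t index;  // result: position of needle, or -1 if absent
};

// Thread entry point: runs std::find inside a pthread-created thread.
extern "C" void* find_worker(void* arg) {
    FindJob* job = static_cast<FindJob*>(arg);
    std::vector<int>::const_iterator it =
        std::find(job->data->begin(), job->data->end(), job->needle);
    job->index = (it == job->data->end()) ? -1 : it - job->data->begin();
    return nullptr;
}

// Starts one worker with pthread_create and waits for it.
// Returns -2 if the thread could not be created.
std::ptrdiff_t find_in_thread(const std::vector<int>& data, int needle) {
    FindJob job = { &data, needle, -1 };
    pthread_t tid;
    if (pthread_create(&tid, nullptr, find_worker, &job) != 0)
        return -2;
    pthread_join(tid, nullptr);
    return job.index;
}
```

Calling find_in_thread(v, x) from main gives the same result as a direct std::find; the point is only that the algorithm runs on a thread that was not created by OpenMP.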
Comment 10 Johannes Singler 2010-01-12 14:35:01 UTC
Can reproduce deadlock now.
Comment 11 Johannes Singler 2010-01-12 14:35:56 UTC
(In reply to comment #9)
> Could this bug be related to this one:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36242#c4

This bug is invalid for GCC 4.4.
 
> Clamd creates threads using pthread_create, std::find is called from those
> threads. There are also threads that only poll/dispatch, and never use the STL
> (hence never uses openmp). However the gcc manual doesn't mention
> incompatibility between pthread_create and openmp (or libstdc++ parallel mode).

It should work nevertheless.
Comment 12 Johannes Singler 2010-01-12 17:42:48 UTC
Thread 1 waits for its colleagues, but where have they gone?  Is it possible that an exception is thrown inside find (by the value type or the predicate)?
I don't fully trust gdb in this case, but it shows that an iterator range of (NULL, NULL) had to be searched.
Comment 13 Török Edwin 2010-01-12 17:54:17 UTC
(In reply to comment #12)
> Thread 1 waits for its colleagues, but where are they gone?  Is it possible
> that an exception is thrown inside find (by means of the value type or the
> predicate)?
> I don't fully trust gdb in this case, but it shows that an iterator range of
> (NULL, NULL) had to be searched.
> 

This code is compiled with -fno-exceptions, could that be a problem?
Comment 14 Johannes Singler 2010-01-13 13:53:11 UTC
(In reply to comment #13)

> This code is compiled with -fno-exceptions, could that be a problem?

No, that should rather help.

Still, it is very difficult to debug this.  Is there at least a way to access clamd's stdout and/or stderr?
Comment 15 Török Edwin 2010-01-13 20:39:24 UTC
(In reply to comment #14)
> (In reply to comment #13)
> 
> > This code is compiled with -fno-exceptions, could that be a problem?
> 
> No, that should rather help.
> 
> Still, it is very difficult to debug this.  Is there at least a way to access
> clamd's stdout and/or stderr?
> 

The usual way to debug clamd is by setting 'Foreground yes' in clamd.conf, however the bug doesn't reproduce then.

You can however still get stderr/stdout by applying the patch below, and starting clamd like this:
$ clamd/clamd -c etc/clamd.conf >stdout.log 2>stderr.log
or even without redirection:
$ clamd/clamd -c etc/clamd.conf

diff --git a/shared/misc.c b/shared/misc.c
index 080d4ec..656dda5 100644
--- a/shared/misc.c
+++ b/shared/misc.c
@@ -247,7 +247,7 @@ int daemonize(void)
        int fds[3], i;
        pid_t pid;

-
+#if 0
     fds[0] = open("/dev/null", O_RDONLY);
     fds[1] = open("/dev/null", O_WRONLY);
     fds[2] = open("/dev/null", O_WRONLY);
@@ -272,7 +272,7 @@ int daemonize(void)
     for(i = 0; i <= 2; i++)
        if(fds[i] > 2)
            close(fds[i]);
-
+#endif
     pid = fork();

     if(pid == -1)
Comment 16 Johannes Singler 2010-01-15 14:29:30 UTC
First, let's remove the superfluous #pragma omp single in two places, to make things simpler (see attached patch for trunk).
The problem still persists; the program deadlocks.

When dropping in some prints (see attached patch), the log ends like this:

find going parallel, requesting 2 thread
thread 0 of 2 starts
thread 0 finished
thread 1 of 2 starts
thread 1 finished
successful join
find going parallel, requesting 2 thread
thread 0 of 2 starts
thread 0 finished

Analysis: Thread 1 never starts (or at least does not reach the first printf). In general, for more threads, only thread 0 starts.  This obviously leads to the deadlock.

So at first sight, I would blame the OpenMP implementation.  Maybe there is still some interference with pthreads.  Any other explanations?
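
The kind of instrumentation that produces a log like the one above can be sketched roughly as follows. This is not the attached patch: count_started_threads is a hypothetical helper, and the messages only imitate the log format. Without -fopenmp the pragmas are ignored and the region runs on a single thread.

```cpp
#include <cstdio>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: opens a parallel region with the requested number of
// threads and returns how many threads actually entered it.
int count_started_threads(int requested) {
    int started = 0;
    std::printf("going parallel, requesting %d threads\n", requested);
#pragma omp parallel num_threads(requested)
    {
        int id = 0, team = 1;
#ifdef _OPENMP
        id = omp_get_thread_num();
        team = omp_get_num_threads();
#endif
        std::printf("thread %d of %d starts\n", id, team);
#pragma omp atomic
        ++started;
        std::printf("thread %d finished\n", id);
    }  // implicit barrier: every member of the team must arrive here
    std::printf("successful join, %d thread(s) started\n", started);
    return started;
}
```

In a healthy run every team member prints its "starts" line and the join message follows. In the failing log, the join message never appears because thread 0 waits at the barrier for a team member that never ran.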
Comment 17 Johannes Singler 2010-01-15 14:30:12 UTC
Created attachment 19616
Removes superfluous pragma omp single twice
Comment 18 Johannes Singler 2010-01-15 14:30:36 UTC
Created attachment 19617
Add printf debug statements.
Comment 19 Paolo Carlini 2010-01-15 14:35:59 UTC
Let's add Jakub to CC; he knows the implementation very well. If needed, please also keep in touch privately.
Comment 20 Török Edwin 2010-01-27 12:35:51 UTC
Thanks to Jakub for the hints.

This is not a bug in libstdc++/gcc:
  the problem is that fork() is called when the process already has threads (created by OpenMP/libstdc++ parallel mode), and after that the child may call only a limited set of functions before exec().
std::find is called both before and after fork(). This is fine in a default build, but in a parallel mode build the first std::find spawns threads, which ClamAV doesn't expect.
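
As a minimal sketch of the fork() rule mentioned here (assuming POSIX; fork_and_exit_child is a hypothetical helper, not ClamAV code): the child of a multithreaded process contains only the thread that called fork(), so it should restrict itself to async-signal-safe calls such as _exit() or the exec family.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks, lets the child do only async-signal-safe work,
// and returns the child's exit status (or -1 on failure).
int fork_and_exit_child(int code) {
    pid_t pid = fork();
    if (pid == -1)
        return -1;  // fork failed
    if (pid == 0) {
        // Child: the parent's other threads (including any OpenMP worker
        // threads) do not exist here. Avoid std:: algorithms that may enter
        // an OpenMP region; call only async-signal-safe functions.
        _exit(code);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A child that instead entered a parallel std::find would rely on runtime state copied from the parent, whose worker threads are not present in the child, which matches the hang at the barrier seen in the backtrace.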

I will just make it a #error if ClamAV is built in libstdc++ parallel mode.
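
Such a guard could look roughly like this. It is a sketch only: the message text and the contains helper are assumptions, not ClamAV's actual code. _GLIBCXX_PARALLEL is the macro that enables libstdc++ parallel mode, as used in the configure line in the description.

```cpp
// Hypothetical guard; the wording of the message is an assumption.
#ifdef _GLIBCXX_PARALLEL
#error "libstdc++ parallel mode (-D_GLIBCXX_PARALLEL) is not supported: fork() is called after std:: algorithms may have started OpenMP threads"
#endif

#include <algorithm>
#include <vector>

// Ordinary sequential std::find; this file compiles only when parallel mode is off.
bool contains(const std::vector<int>& v, int x) {
    return std::find(v.begin(), v.end(), x) != v.end();
}
```

With the guard in place, a build configured with -D_GLIBCXX_PARALLEL fails at compile time instead of deadlocking at run time.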