Bug 58378 - Protect libgomp against child process hanging after a Unix fork()
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: libgomp
Version: 4.8.1
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2013-09-10 09:54 UTC by Olivier Grisel
Modified: 2013-09-10 11:11 UTC

Attachments
Patch to protect libgomp thread pool against fork() (1.23 KB, patch)
2013-09-10 09:54 UTC, Olivier Grisel

Description Olivier Grisel 2013-09-10 09:54:59 UTC
Created attachment 30784
Patch to protect libgomp thread pool against fork()

The problem is discussed in [1]. To summarize: if a process uses OpenMP
features and then calls fork(), the threads in the parent's OpenMP
thread pool are not copied to the child process (which is expected).
If the child process later uses an OpenMP feature again, it hangs,
waiting for threads that do not exist in its own process.
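
The underlying behaviour (only the thread that calls fork() exists in the child) can be observed without OpenMP. The following is a minimal sketch using plain pthreads, not the snippet from [1]; the names `ticks`, `worker` and `worker_missing_in_child` are made up for illustration. A worker thread keeps incrementing a counter: in the parent the counter keeps moving after the fork, while in the child it stays frozen because the worker was not duplicated.

```c
#include <poll.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static atomic_ulong ticks;      /* incremented only by the worker thread */
static atomic_int stop_worker;  /* set by the parent to end the worker */

static void *worker(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop_worker))
        atomic_fetch_add(&ticks, 1);
    return NULL;
}

/* Returns 1 if the counter moved during a wait of about 50 ms, else 0. */
static int counter_advances(void)
{
    unsigned long before = atomic_load(&ticks);
    poll(NULL, 0, 50);          /* poll() is async-signal-safe, so usable in the child */
    return atomic_load(&ticks) != before;
}

/* Returns 1 if the worker runs in the parent but not in the child,
   0 otherwise, and -1 on error. */
int worker_missing_in_child(void)
{
    pthread_t tid;
    int status = 0;

    atomic_store(&stop_worker, 0);
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;
    while (atomic_load(&ticks) == 0)
        poll(NULL, 0, 1);       /* wait until the worker is really running */

    pid_t pid = fork();
    if (pid == 0)
        _exit(counter_advances());  /* child: report whether the counter still moves */

    int alive_in_parent = counter_advances();
    if (pid > 0)
        waitpid(pid, &status, 0);
    atomic_store(&stop_worker, 1);
    pthread_join(tid, NULL);
    if (pid < 0 || !WIFEXITED(status))
        return -1;
    return alive_in_parent == 1 && WEXITSTATUS(status) == 0;
}
```

Calling `worker_missing_in_child()` from `main` (built with something like `gcc -pthread`) is expected to return 1. A runtime that waits for such a worker in the child would block forever, which is the hang described above.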

In practice this can often happen in Python programs that import
modules that use OpenMP internally while also using the
`multiprocessing` module. This module is in the Python standard
library and uses Unix fork for handling multi-core concurrency
efficiently at the Python level.

I attach to this report the patch, together with a test that checks that the fix actually works. The patch can also be viewed on this GitHub branch [2].

When running the example snippet from [1] saved in a file called
`openmp_fork.c` I get the expected output:

    $ gcc-head -fopenmp -o openmp_fork openmp_fork.c && ./openmp_fork
    para_a
    para_a
    a ended
    para_b
    para_b
    b ended

instead of a hanging process.


[1] http://bisqwit.iki.fi/story/howto/openmp/#OpenmpAndFork
[2] https://github.com/ogrisel/gcc/compare/forksafe-omp-pthread_atfork

Note that the OpenMP implementation of ICC does not hang either when using fork.

Note 2: this problem is related to (a duplicate of)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52303, which was deemed
invalid because the POSIX standard states that it is unsafe to use threading after a fork and prior to calling exec. However, the `pthread_atfork` protection from this patch is rather non-invasive and should not impact any process that does not initialize the libgomp runtime prior to a fork. Interpreted language implementations such as CPython that (ab)use Unix fork for efficient concurrency would really benefit from such protection against libraries that use OpenMP internally without the caller necessarily being aware of it.
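
For readers unfamiliar with `pthread_atfork`, here is a minimal sketch of the general mechanism: a handler registered as the third argument runs in the child immediately after fork(). This is not the attached patch and not libgomp code; `pool_threads` and `reset_pool_in_child` are hypothetical stand-ins for a runtime's thread-pool bookkeeping.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the number of workers a runtime believes it owns. */
static volatile int pool_threads;

/* Child-side handler: runs in the child right after fork() returns there. */
static void reset_pool_in_child(void)
{
    pool_threads = 0;   /* the workers were not copied, so forget them */
}

/* Returns the pool size observed in the child (0 expected), or -1 on error. */
int pool_size_seen_by_child(void)
{
    static int registered;
    int status;

    if (!registered) {
        /* No prepare or parent handler, only a child handler. */
        if (pthread_atfork(NULL, NULL, reset_pool_in_child) != 0)
            return -1;
        registered = 1;
    }

    pool_threads = 4;   /* pretend a 4-thread pool exists before the fork */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(pool_threads);    /* child reports what it sees */

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

In this sketch the parent's `pool_threads` is left at 4 and only the child's copy is reset. Whether such a reset is acceptable inside a real OpenMP runtime is a separate question that the sketch does not answer.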
Comment 1 Jakub Jelinek 2013-09-10 10:00:51 UTC
The PR52303 comment about what you are trying to do being invalid of course applies here too, and the patch isn't anything close to non-invasive; it is just wrong, because it will break valid OpenMP programs.
Comment 2 Olivier Grisel 2013-09-10 10:17:44 UTC
What kind of breakage does this introduce? It's a real question; I am not an experienced OpenMP developer.

Do you see any solution that would prevent libgomp-based programs such as the one mentioned in [1] from hanging after a fork()?

Is it worth my spending time trying to work out a better and safer solution (maybe with some guidance)?
Comment 3 Jakub Jelinek 2013-09-10 10:27:14 UTC
Please read OpenMP 4.0, section 2.14.2 (threadprivate directive), or corresponding sections in older standards.
The implementation must preserve values of threadprivate variables in certain cases, which your hack violates.  If somebody (validly) does:
int v;
#pragma omp threadprivate (v)
...
omp_set_dynamic (0);
#pragma omp parallel num_threads (4)
{
  v = omp_get_thread_num () * 16;
}
...
pid = fork ();
if (pid == 0)
  {
    /* Valid fork child, only calling functions POSIX allows it to.  */
    execve (...);
  }

#pragma omp parallel num_threads (4)
{
  if (v != omp_get_thread_num () * 16)
    abort ();
}

then the implementation must preserve the threadprivate values, but with your patch all the threads but the initial one will be lost during fork and thus v will be 0 instead of the desired value later on.

There is no point trying to hack around bugs in your code inside of libgomp; simply follow the requirements for using fork in multithreaded apps.
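
As an illustration of the usage POSIX does allow, the sketch below forks and then calls only async-signal-safe functions in the child before replacing it with a new program. The helper name `spawn_and_wait` is made up for this example, and the sketch assumes the caller passes a valid executable path and a NULL-terminated argument vector.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs `path` with `argv` in a child process.
   Returns the child's exit status, or -1 on failure. */
int spawn_and_wait(const char *path, char *const argv[])
{
    int status;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: nothing but async-signal-safe calls between fork() and exec. */
        execv(path, argv);
        _exit(127);     /* reached only if execv() failed */
    }

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For example, calling it with `/bin/sh` and the arguments `sh -c "exit 7"` should return 7. The child never touches any thread pool or other runtime state inherited from the parent, so it does not matter which threads existed before the fork.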
Comment 4 Olivier Grisel 2013-09-10 10:51:31 UTC
Thanks for the explanation. Would you consider a solution that would preserve the state of the parent process and would just reset the thread pool data on the child?

Otherwise we will have to consider that the way fork() is used in Python's multiprocessing module is really an abuse and that there is no way to safely use both OpenMP libraries and Python multiprocessing in the same program.
Comment 5 Jakub Jelinek 2013-09-10 10:59:34 UTC
Having a pthread_atfork child hook that would do freeing of memory, or pthread_mutex_init etc., would only make any OpenMP program using fork invalid, even those that use it correctly.
Comment 6 Olivier Grisel 2013-09-10 11:11:26 UTC
Alright, thanks again. For reference, I just discovered that the issue has recently been fixed in Python 3.4 by adding a new `forkserver` option to multiprocessing.

  http://bugs.python.org/issue8713