Stage3 closing soon, call for patch pings

Jeff Law law@redhat.com
Fri Jan 16 07:48:00 GMT 2015


On 01/15/15 16:43, Nathaniel Smith wrote:
>>
>> Jakub, myself and management have discussed this issue extensively and those
>> patches specifically.  I'm painfully aware of how this affects the ability
>> to utilize numerical packages in Python.
>
> Thanks for the response! I had no idea anyone was paying attention :-).
We've got customers who care about this issue, so naturally it gets a 
goodly amount of attention up the management chain.


>
>> The fundamental problem is you're penalizing conformant code to make
>> non-conformant code work.  In the last iteration I think the problem was
>> major memory leakage and nobody could see a way to avoid that.
>
> I'm afraid I'm a bit confused by this part.
I'm going from memory rather than the patch itself (Jakub understands 
the technical side of this issue far better than I).

Jakub, can you chime in on Nathaniel's clarifications below?  If the 
leak is strictly in non-conformant code, that seems much less of a 
problem than I recall from our internal discussions.



>
> In the patch I linked, the costs imposed on conformant programs are:
>
> 1) One extra 'int' per thread pool.
> 2) An increment/decrement pair during each call to fork().
> 3) A single 'if (__builtin_expect(..., 0)) { ... }' in gomp_team_start.
>
> That's all. There is definitely no memory leakage for conformant code.
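
If I'm following the description correctly, the mechanism would be 
roughly the sketch below. This is reconstructed from the costs listed 
above, not from the patch itself; fork_generation, pool_is_stale and 
fork_stamp are made-up names for illustration.

    #include <pthread.h>

    /* A global counter bumped around fork() via pthread_atfork:
       prepare increments, the parent-side handler decrements, so the
       parent sees no net change while the child keeps the new value.
       That's the increment/decrement pair in (2) above.  */
    static unsigned long fork_generation;

    static void prepare_cb (void) { fork_generation++; }
    static void parent_cb  (void) { fork_generation--; }

    __attribute__((constructor))
    static void install_fork_hooks (void)
    {
      pthread_atfork (prepare_cb, parent_cb, NULL);
    }

    /* The one extra int per pool in (1) would be a stamp recorded at
       pool creation.  The check in (3), placed in gomp_team_start:  */
    struct pool_sketch { unsigned long fork_stamp; /* ... */ };

    static int pool_is_stale (struct pool_sketch *pool)
    {
      /* A mismatch means fork() happened since the pool was created,
         so its worker threads do not exist in this process; abandon
         the pool and build a fresh one instead of deadlocking.  */
      return __builtin_expect (pool->fork_stamp != fork_generation, 0);
    }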
>
> There *is* a memory leakage for non-conformant code: if you use OMP in
> the parent, then fork, and then use OMP in the child, then without the
> patch you get a deadlock; with the patch everything functions
> correctly, but the child's COW copies of the parent's thread stacks
> are leaked.
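
To make the non-conformant pattern concrete, it is essentially this 
(a minimal sketch; build with gcc -fopenmp):

    #include <unistd.h>
    #include <sys/wait.h>

    int main (void)
    {
    #pragma omp parallel
      {
        /* parent uses OMP: libgomp spins up its worker threads */
      }

      pid_t pid = fork ();
      if (pid == 0)
        {
    #pragma omp parallel
          {
            /* child uses OMP: fork() copied the pool's bookkeeping
               but not the worker threads.  Unpatched, this waits
               forever on threads that don't exist; patched, a fresh
               pool is built and the old COW'd stacks are leaked.  */
          }
          _exit (0);
        }
      waitpid (pid, NULL, 0);
      return 0;
    }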
>
> There are a few things that somewhat mitigate this in practice.
>
> The main one is just: Leaking is a lot better than crashing. Esp. when
> the only other way to fix the crash is to completely rewrite your code
> (or some third-party code!) to avoid OMP, which is rather prohibitive.
> Of course I'd rather not leak at all, but this isn't really a
> situation where one can say "well that was a stupid idea so don't do
> that" -- a memory leak here really is a better outcome than any
> currently available alternative, enables use cases that just aren't
> possible otherwise, and if OMP fixes its problems later then we can
> update our fix to follow.
>
> It's also worth thinking a bit about the magnitude of the problem: In
> practice the most common case where an OMP-using program forks and
> then the children use OMP is going to be something like Python
> multiprocessing, where the child processes form a worker pool. In this
> case, the total effective memory leakage is somewhere between 0 and 1
> copies of the parent's threads -- all the children will share a single
> COW copy, and the parent may share some or all of that COW copy as
> well, depending on its OMP usage. Conversely, it's very rare to have a
> process that forks, then the child uses OMP then forks, then the
> grandchild uses OMP then forks, ... etc. Yet this is what would be
> required to get an unbounded memory leak here. If you only have a
> single level of forking, then each child only has a single leak of
> fixed size. There are many many situations where a few megabytes of
> overhead are a small price to pay for having a convenient way to
> process large datasets.
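
In code, that worker-pool shape is roughly the following (an 
illustrative sketch, not any particular program; build with 
gcc -fopenmp):

    #include <unistd.h>
    #include <sys/wait.h>

    #define NWORKERS 4

    int main (void)
    {
    #pragma omp parallel
      { /* parent uses OMP before forking its worker pool */ }

      pid_t kids[NWORKERS];
      for (int i = 0; i < NWORKERS; i++)
        if ((kids[i] = fork ()) == 0)
          {
    #pragma omp parallel
            {
              /* each child abandons the stale pool and builds its
                 own.  The leaked pages are COW copies of the same
                 parent stacks, never written, so the whole pool
                 shares roughly one extra copy in total.  */
            }
            _exit (0);
          }

      for (int i = 0; i < NWORKERS; i++)
        waitpid (kids[i], NULL, 0);
      return 0;
    }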
>
> Does this affect your opinion about this patch, or am I better off
> giving up now?
>
> -n
>


