[PATCH] libgomp, OpenMP, nvptx: Low-latency memory allocator

Tom de Vries tdevries@suse.de
Thu Jan 6 09:29:29 GMT 2022


On 1/5/22 15:36, Andrew Stubbs wrote:
> On 05/01/2022 13:04, Tom de Vries wrote:
>> On 1/5/22 12:08, Tom de Vries wrote:
>>> The allocators-1.c test-case doesn't compile because:
>>> ...
>>> FAIL: libgomp.c/allocators-1.c (test for excess errors)
>>> Excess errors:
>>> /home/vries/oacc/trunk/source-gcc/libgomp/testsuite/libgomp.c/allocators-1.c:7:22: 
>>> sorry, unimplemented: 'dynamic_allocators' clause on 'requires' directive not 
>>> supported yet
>>>
>>> UNRESOLVED: libgomp.c/allocators-1.c compilation failed to produce 
>>> executable
>>> ...
>>>
>>> So, I suppose I need "[PATCH] OpenMP front-end: allow requires 
>>> dynamic_allocators" as well, I'll try again with that applied.
>>
>> After applying that, I get:
>> ...
>> WARNING: program timed out.
>> FAIL: libgomp.c/allocators-2.c execution test
>> WARNING: program timed out.
>> FAIL: libgomp.c/allocators-3.c execution test
>> ...
> 
> It works for me.....
> 
> Those tests are doing some large number of allocations repeatedly and in 
> parallel to stress the atomics. They're also slightly longer running 
> than the other tests.
>    - allocators-2 calls omp_alloc 8080 times, over 16 kernel launches, 
> some of which will fall back to PTX malloc.

I've minimized the test-case by enabling a single call in main at a 
time.  All but the last 4 take about two seconds, the last 4 hang (and 
time out at 5min).

So, this already times out for me:
...
int
main ()
{
   test (1000, omp_low_lat_mem_alloc);
   return 0;
}
...

I tried playing around with n: roughly, there's no hang below 
100 and a hang above 200; in between there may or may not be a hang.

Again the same dynamic: if there's no hang, it just takes a few seconds.

>    - allocators-3 calls omp_alloc and omp_free 8 million times each, 
> over 8 kernel launches, and takes about a minute to run on my device 
> (whether that falls back depends entirely on how the free calls 
> interleave).
> 
> Either there is a flaw in the concurrency causing some kind of deadlock, 
> or else your timeout is set too short for your device. I hope it's the 
> latter. We may need to tweak this.

At first glance, the above behaviour doesn't look like a timeout that is too short.

[ FTR, I'm using a GT 1030 with production branch driver version 470.86 
(which is one version behind the latest 470.94) ]

Thanks,
- Tom

