[PATCH, og8] Add OpenACC 2.6 `acc_get_property' support
Chung-Lin Tang
chunglin_tang@mentor.com
Mon Dec 10 09:06:00 GMT 2018
On 2018/12/6 2:16 AM, Maciej W. Rozycki wrote:
>> AFAIK, things like strlen are already available in iso_c_binding, in forms
>> like "C_strlen".
>> Can you check again if that 'openacc_c_string' module is really necessary?
> Any pointers please?
>
> I can't see `c_strlen' or any equivalent interface defined either in the
> Fortran 2003 language standard or in GCC documentation, and neither `grep'
> over the GCC tree shows anything relevant. The `iso_c_binding' module
> defines only a bunch of procedures according to said documentation. The
> `strlen' function provided here has been taken from one of our Fortran
> test cases, which strongly indicates there's no such API already available
> or whoever wrote the test case would have chosen to use it I suppose.
Okay, I see. I think I mixed up the common naming convention with the actual
interface standard.
>>> + CUcontext new_ctx;
>>> +
>>> + CUDA_CALL_ERET (propval, cuCtxCreate, &new_ctx, CU_CTX_SCHED_AUTO,
>>> + dev);
>>> + CUDA_CALL_ERET (propval, cuMemGetInfo, &free_mem, &total_mem);
>>> + CUDA_CALL_ASSERT (cuCtxDestroy, new_ctx);
>>> + }
>> (I'm CCing Tom here, as he is maintainer for these parts)
>>
>> As we discussed earlier on our internal list, I think properly using
>> GOMP_OFFLOAD_init_device is the right way, instead of using the lower
>> level CUDA context create/destroy.
>>
>> I did not mean for you to first init the device and then immediately
>> destroy it by GOMP_OFFLOAD_fini_device, just to obtain the property, but
>> for you to just take the opportunity to initialize it for use, and leave
>> it there until program exit. That should save resources overall.
>> (BTW, CUDA contexts should be quite expensive to create/destroy, using a
>> cuCtxCreate/Destroy pair is probably almost as slow)
> I have argued that this looks like a corner-case use case to me, as
> querying for the remaining (rather than total) memory available to a
> device that hasn't (yet) been used looks to be of hardly any use to me,
> because obviously at such a stage no memory has been used. The OpenACC
> standard does require us to handle such a request somehow, with returning
> 0 being another option, however I thought we may well have a quick peek
> without pulling in all the state.
>
> I guess I have no strong opinion either way and I can adapt accordingly.
>
> NB that would have to be `gomp_init_device' rather than
> `GOMP_OFFLOAD_init_device' AFAICS.
You'll have to use GOMP_OFFLOAD_init_device: as you are inside the plugin,
gomp_init_device() should not be available.
However, looking into this further, the checking conventions of
GOMP_OFFLOAD_init_device will have to be slightly tweaked to accommodate
possible further initialization from libgomp proper, so this may require a
longer series of changes. I think it's not worth it, or it can be adjusted
later. I now think your current approach with the CUDA contexts is okay.
I think the patch is okay, although it still needs approval from Thomas and Tom
before it can be committed.
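
For reference, the temporary-context approach discussed above can be sketched
roughly as follows. This is only an illustration of the control flow, not the
code from the patch: the mock_* functions and query_free_memory_uninitialized
are hypothetical stand-ins for cuCtxCreate, cuMemGetInfo and cuCtxDestroy, so
that the sketch runs without a GPU. Unlike the quoted hunk, the sketch also
destroys the context when the memory query fails; that is a choice made here,
not something the patch is known to do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the CUDA driver calls named in the patch
   (cuCtxCreate, cuMemGetInfo, cuCtxDestroy).  They only model success or
   failure so that the control flow can be exercised without a device.  */
typedef struct { int live; } mock_ctx;

static int mock_live_contexts = 0;  /* Contexts currently alive.  */
static int mock_fail_create = 0;    /* Set non-zero to simulate a failure.  */

static int
mock_ctx_create (mock_ctx *ctx)
{
  if (mock_fail_create)
    return 1;
  ctx->live = 1;
  mock_live_contexts++;
  return 0;
}

static int
mock_mem_get_info (size_t *free_mem, size_t *total_mem)
{
  if (mock_live_contexts == 0)
    return 1;  /* No context to query against.  */
  *total_mem = (size_t) 1 << 30;
  *free_mem = *total_mem;  /* An unused device has consumed no memory.  */
  return 0;
}

static int
mock_ctx_destroy (mock_ctx *ctx)
{
  if (!ctx->live)
    return 1;
  ctx->live = 0;
  mock_live_contexts--;
  return 0;
}

/* Query the free memory of a device that has not been initialized, using
   a short-lived context.  Returns 0 if the query cannot be made.  */
size_t
query_free_memory_uninitialized (void)
{
  mock_ctx ctx = { 0 };
  size_t free_mem = 0, total_mem = 0;

  /* Corresponds to the CUDA_CALL_ERET use: give up early on failure.  */
  if (mock_ctx_create (&ctx) != 0)
    return 0;

  int query_err = mock_mem_get_info (&free_mem, &total_mem);

  /* Corresponds to the CUDA_CALL_ASSERT use: teardown must not fail.  */
  int destroy_err = mock_ctx_destroy (&ctx);
  assert (destroy_err == 0);
  (void) destroy_err;

  return query_err == 0 ? free_mem : 0;
}
```

The point of the pattern is that no device state outlives the query, at the
cost of one context create/destroy pair per call.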
Thanks,
Chung-Lin