[PATCH, og8] Add OpenACC 2.6 `acc_get_property' support

Chung-Lin Tang chunglin_tang@mentor.com
Mon Dec 10 09:06:00 GMT 2018


On 2018/12/6 2:16 AM, Maciej W. Rozycki wrote:
>> AFAIK, things like strlen are already available in iso_c_binding, in forms
>> like "C_strlen".
>> Can you check again if that 'openacc_c_string' module is really necessary?
>   Any pointers please?
> 
>   I can't see `c_strlen' or any equivalent interface defined either in the
> Fortran 2003 language standard or in GCC documentation, and neither `grep'
> over the GCC tree shows anything relevant.  The `iso_c_binding' module
> defines only a bunch of procedures according to said documentation.  The
> `strlen' function provided here has been taken from one of our Fortran
> test cases, which strongly indicates there's no such API already available
> or whoever wrote the test case would have chosen to use it I suppose.

Okay, I see. I think I mixed up a common naming convention with what the
interface standard actually defines.

>>> +	    CUcontext new_ctx;
>>> +
>>> +	    CUDA_CALL_ERET (propval, cuCtxCreate, &new_ctx, CU_CTX_SCHED_AUTO,
>>> +			    dev);
>>> +	    CUDA_CALL_ERET (propval, cuMemGetInfo, &free_mem, &total_mem);
>>> +	    CUDA_CALL_ASSERT (cuCtxDestroy, new_ctx);
>>> +	  }
>> (I'm CCing Tom here, as he is maintainer for these parts)
>>
>> As we discussed earlier on our internal list, I think properly using
>> GOMP_OFFLOAD_init_device is the right way, instead of using the lower level
>> CUDA context create/destroy.
>>
>> I did not mean for you to first init the device and then immediately destroy
>> it by GOMP_OFFLOAD_fini_device, just to obtain the property, but for you to
>> just take the opportunity to initialize it for use, and leave it there until
>> program exit. That should save resources overall.
>> (BTW, CUDA contexts should be quite expensive to create/destroy, using a
>> cuCtxCreate/Destroy pair is probably almost as slow)
>   I have argued that this looks like a corner-case use case to me, as
> querying for the remaining (rather than total) memory available to a
> device that hasn't been (yet) used looks like of hardly any use to me,
> because obviously at such a stage no memory has been used.  The OpenACC
> standard does require us to handle such a request somehow, with returning
> 0 being another option, however I thought we may well have a quick peek
> without pulling in all the state.
> 
>   I guess I have no strong opinion either way and I can adapt accordingly.
> 
>   NB that would have to be `gomp_init_device' rather than
> `GOMP_OFFLOAD_init_device' AFAICS.

You'll have to use GOMP_OFFLOAD_init_device: you are inside the plugin, where
gomp_init_device() should not be available.

However, looking into this further, the checking conventions of
GOMP_OFFLOAD_init_device would have to be tweaked slightly to accommodate
possible further initialization from libgomp proper, so this may require a
longer string of changes. I think it's not worth it, or it can be adjusted
later. I now think your current approach with the CUDA contexts is okay.

I think the patch is okay, although it still needs approval from Thomas and Tom
before it can be committed.

Thanks,
Chung-Lin


