[PATCH,nvptx] Use CUDA driver API to select default runtime launch geometry

Tue Aug 7 13:53:00 GMT 2018

On 08/06/2018 11:08 PM, Tom de Vries wrote:
> On 08/01/2018 12:18 PM, Tom de Vries wrote:
> 
>> I think we need to add and handle:
>> ...
>>   CUDA_ONE_CALL_MAYBE_NULL (cuOccupancyMaxPotentialBlockSize)
>> ...
>>
> 
> I realized that the patch I posted introducing CUDA_ONE_CALL_MAYBE_NULL
> was incomplete, and needed to use the weak attribute in case of linking
> against a concrete libcuda.so.
> 
> So, I've now committed a patch implementing just CUDA_ONE_CALL_MAYBE_NULL:
> "[libgomp, nvptx] Handle CUDA_ONE_CALL_MAYBE_NULL" @
> https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00447.html . You can use
> "CUDA_CALL_EXISTS (cuOccupancyMaxPotentialBlockSize)" to test for
> existence of the function in the cuda driver API.
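
To make the existence check above concrete, here is a minimal sketch of the
pattern: test whether an optional driver entry point is available, and fall
back to a fixed launch geometry when it is not. Everything named demo_* or
DEMO_* is hypothetical and only stands in for the real plugin code. The
function-pointer table, the simplified two-argument signature, and the
fallback values (1 gang, 32 threads) are assumptions made for illustration,
not what libgomp or the CUDA driver actually use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified signature for an occupancy query.  The real
   CUDA driver function takes more arguments.  */
typedef int (*demo_occupancy_fn) (int *min_grid, int *block);

/* Hypothetical table of entry points, as if filled in by dlsym.  A NULL
   entry means the loaded driver does not export that function.  */
struct demo_calls
{
  demo_occupancy_fn occupancy_max_potential_block_size;
};

/* Hypothetical analogue of an "exists" test on an optional entry point.  */
#define DEMO_CALL_EXISTS(TABLE, FN) ((TABLE)->FN != NULL)

/* Stand-in for a driver that does provide the occupancy query.  */
int
demo_fake_occupancy (int *min_grid, int *block)
{
  *min_grid = 40;
  *block = 256;
  return 0; /* 0 means success in this sketch.  */
}

/* Choose a launch geometry: ask the driver when it can answer, otherwise
   use fixed defaults (assumed values, chosen only for the example).  */
void
demo_pick_geometry (const struct demo_calls *calls, int *grid, int *block)
{
  assert (calls != NULL && grid != NULL && block != NULL);

  if (DEMO_CALL_EXISTS (calls, occupancy_max_potential_block_size)
      && calls->occupancy_max_potential_block_size (grid, block) == 0)
    return;

  *grid = 1;
  *block = 32;
}
```

With a table whose entry is demo_fake_occupancy, demo_pick_geometry reports
40 and 256; with a NULL entry it reports the fallback 1 and 32.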

Sorry for taking so long getting this patch updated. It's a slow build
and test cycle getting older versions of cuda to play nicely. So far,
I've managed to get CUDA 5.5 partially working with Nvidia driver
331.113 (which supports CUDA 6.0) in the sense that I spotted an error
with the patch; I realized that the cuda.h that ships with libgomp
emulates CUDA 8.0. That led to problems using cuLinkAddData,
because that function gets remapped to cuLinkAddData_v2 in CUDA 6.5 and
newer.
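
For readers unfamiliar with this kind of remapping, the sketch below shows
how a header can silently rename a function depending on the API version it
claims to implement. The names demo_link_add_data, DEMO_API_VERSION and the
6050 threshold are hypothetical stand-ins chosen to mirror the description
above; they are not copied from cuda.h. The point is only that code compiled
against a header emulating a newer version ends up referencing the _v2
symbol, which an older driver may not export.

```c
#include <string.h>

/* Hypothetical: the API version the bundled header claims to emulate.  */
#define DEMO_API_VERSION 8000

/* Hypothetical remapping rule: from version 6.5 on, the plain name is an
   alias for the _v2 entry point.  */
#if DEMO_API_VERSION >= 6050
# define demo_link_add_data demo_link_add_data_v2
#endif

/* Two-level stringification so the argument is macro-expanded first.  */
#define DEMO_STR_(x) #x
#define DEMO_STR(x) DEMO_STR_ (x)

/* Returns the symbol name that a call to demo_link_add_data would
   actually reference after preprocessing.  */
const char *
demo_resolved_symbol (void)
{
  return DEMO_STR (demo_link_add_data);
}
```

With DEMO_API_VERSION set to 8000 as above, demo_resolved_symbol returns
"demo_link_add_data_v2"; a driver that only provides the unversioned symbol
would then fail the lookup.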

That leads me to a question: do we really want to support older versions
of CUDA without using the system's CUDA header files?

>> The patch doesn't build in a setup with
>> --enable-offload-targets=nvptx-none and without cuda, that enables usage
>> of plugin/cuda/cuda.h:
>> ...
>> /data/offload-nvptx/src/libgomp/plugin/plugin-nvptx.c:98:16: error:
>> ‘cuOccupancyMaxPotentialBlockSize’ undeclared here (not in a function);
>> did you mean ‘cuOccupancyMaxPotentialBlockSizeWithFlags’?
>>  CUDA_ONE_CALL (cuOccupancyMaxPotentialBlockSize) \
>> ...
>>
> 
> I've committed a patch "[libgomp, nvptx, --without-cuda-driver] Don't
> use system cuda driver" @
> https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00348.html .
> 
> Using --without-cuda-driver should make it easy to build using the
> dlopen interface without having to de-install the system libcuda.so.

I attached an updated version of the CUDA driver patch, although I
haven't rebased it against your changes yet. It still needs to be tested
against CUDA 5.5 using the system's/Nvidia's cuda.h. But I wanted to give
you an update.

Does this patch look OK, at least after testing completes? I removed the
tests for CUDA_ONE_CALL_MAYBE_NULL, because the newer CUDA API isn't
supported in the older drivers.

Cesar

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-nvptx-Use-CUDA-driver-API-to-select-default-runtime-.patch
Type: text/x-patch
Size: 5996 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20180807/45ca0ceb/attachment.bin>