Nvptx GPU offloading using OpenMP4 and GCC 7.2

Caspar van Leeuwen caspar.vanleeuwen@surfsara.nl
Wed Sep 6 15:49:00 GMT 2017


Hi, 

I have been trying to install the GCC 7.2 compiler with offload capabilities for nvptx, but so far without success. 

I mainly followed https://gcc.gnu.org/wiki/Offloading and https://kristerw.blogspot.nl/2017/04/building-gcc-with-support-for-nvidia.html. The script I finally used for the build is attached (compileScript.sh). To understand the script, note that "module load cuda/8.0.61" sets the CUDA_PATH variable; you can ignore the #SBATCH lines, which are only there because I ran the build as a batch job. 

I've managed to build nvptx-tools and the GCC-nvptx and GCC-host compilers without errors. However, when I compile a minimal example of a for loop distributed using an OpenMP 4 "#pragma omp target" statement (gcc -fopenmp -o openMP_GPU_minimal openMP_GPU_minimal.c), the compiler returns the following error: 

gcc: warning: '-x lto' after last input file has no effect 
gcc: fatal error: no input files 

Attached you'll find the compiler output I get with the -v option; it may be more informative than the rather vague warning above. 
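
For reference, the kind of minimal test meant here can be sketched as follows. This is not the attached openMP_GPU_minimal.c: the function name fill_doubled and the computation are made up for illustration, and the combined "target teams distribute parallel for" construct is just one common way to distribute a loop over a device.

```c
/* Hypothetical helper, not taken from the attached file.
   Fills out[i] = 2 * i for i in [0, n) inside an OpenMP target region.
   With a working offload setup the loop is meant to run on the GPU.
   If no device is usable, OpenMP falls back to the host, and if the
   file is built without -fopenmp the pragma is ignored and the loop
   runs serially. The result is the same in all three cases. */
void fill_doubled(int *out, int n)
{
    /* map(from: ...) copies the array section back to the host
       once the target region has finished. */
    #pragma omp target teams distribute parallel for map(from: out[0:n])
    for (int i = 0; i < n; i++)
        out[i] = 2 * i;
}
```

Calling fill_doubled on a small array from main() and building with gcc -fopenmp, as in the command above, should be enough to exercise the offload path of the compiler driver. Note that a correct result alone does not show where the loop ran, because of the host fallback.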

I found this thread, https://gcc.gnu.org/ml/gcc-help/2016-04/msg00111.html, which deals with exactly the same issue, but the suggestion to install everything (nvptx-tools, gcc-host and gcc-accelerator compilers) in the same <something>/install directory didn't help me: I had already done that from the start, as suggested by kristerw's script. 

Another thing I noticed is that if I add the -flto option, the compilation completes without errors. However, when I then call omp_is_initial_device() inside the #pragma omp target region, it returns 'true', indicating that the code is running on the host and not on the accelerator (GPU) as intended. Note that omp_get_num_devices() correctly returns 2 (there are 2 GPUs in the system), but I don't think this tells me anything about whether I can successfully offload code: I believe omp_get_num_devices() is just host code, defined in libgomp.so. So at best, it tells me that I'm using a libgomp.so that supports detecting these accelerators. 
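
The check described above can be sketched roughly like this. The function name target_region_on_host is hypothetical, and the _OPENMP guard is an addition of this sketch so that the file still builds without -fopenmp; omp_is_initial_device() itself is the standard OpenMP runtime call.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: returns 1 if the target region executed on the
   host (the initial device), 0 if it executed on an accelerator. */
int target_region_on_host(void)
{
    /* Default used when the file is built without -fopenmp:
       in that case nothing can be offloaded. */
    int on_host = 1;
#ifdef _OPENMP
    /* Scalars are not copied back from a target region by default,
       so the result is mapped back to the host explicitly. */
    #pragma omp target map(from: on_host)
    {
        on_host = omp_is_initial_device() ? 1 : 0;
    }
#endif
    return on_host;
}
```

A return value of 1 corresponds to the situation described above, where the region silently runs on the host. Since omp_get_num_devices() is called from host code, printing it next to this result only shows what the runtime detects, not where the region actually ran.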

For the sake of completeness, let me also include the output of gcc -v for the host and accelerator compilers, so you can check if that makes sense. 

For the host compiler (gcc or x86_64-pc-linux-gnu-gcc, both return the same): 
Using built-in specs. 
COLLECT_GCC=/home/casparl/GCC_with_nvptx/work/install/bin/gcc 
COLLECT_LTO_WRAPPER=/nfs/home4/casparl/GCC_with_nvptx/work/install/bin/../libexec/gcc/x86_64-pc-linux-gnu/7.2.0/lto-wrapper 
OFFLOAD_TARGET_NAMES=nvptx-none 
Target: x86_64-pc-linux-gnu 
Configured with: ../gcc-7.2.0/configure --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --enable-offload-targets=nvptx-none=/home/casparl/GCC_with_nvptx/work/install --with-cuda-driver-include=/hpc/sw/cuda/8.0.61//include --with-cuda-driver-lib=/hpc/sw/cuda/8.0.61//lib64 --disable-multilib --enable-languages=c,c++,fortran,lto --prefix=/home/casparl/GCC_with_nvptx/work/install 
Thread model: posix 
gcc version 7.2.0 (GCC) 

For the accelerator compiler (x86_64-pc-linux-gnu-accel-nvptx-none-gcc -v): 
Using built-in specs. 
COLLECT_GCC=x86_64-pc-linux-gnu-accel-nvptx-none-gcc 
COLLECT_LTO_WRAPPER=/nfs/home4/casparl/GCC_with_nvptx/work/install/bin/../libexec/gcc/x86_64-pc-linux-gnu/7.2.0/accel/nvptx-none/lto-wrapper 
Target: nvptx-none 
Configured with: ../gcc-7.2.0/configure --target=nvptx-none --with-build-time-tools=/home/casparl/GCC_with_nvptx/work/install/nvptx-none/bin --enable-as-accelerator-for=x86_64-pc-linux-gnu --disable-sjlj-exceptions --enable-newlib-io-long-long --disable-multilib --enable-languages=c,c++,fortran,lto --prefix=/home/casparl/GCC_with_nvptx/work/install 
Thread model: single 
gcc version 7.2.0 (GCC) 

I'm afraid I don't have enough insight into what the gcc warning indicates (e.g. whether the problem lies in the options of my host or accelerator compilers, in which compilers/linkers are used, etc.). Any help to get me going is greatly appreciated, because I've exhausted all potential solutions I could think of (and I'd love to give OpenMP offloading a try!). 

Cheers, 

Caspar van Leeuwen
-------------- next part --------------
A non-text attachment was scrubbed...
Name: compileScript.sh
Type: application/x-shellscript
Size: 2164 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc-help/attachments/20170906/c8004f82/attachment.bin>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: compilerOutput.txt
URL: <https://gcc.gnu.org/pipermail/gcc-help/attachments/20170906/c8004f82/attachment.txt>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: openMP_GPU_minimal.c
Type: text/x-c++src
Size: 1053 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc-help/attachments/20170906/c8004f82/attachment-0001.bin>

