[PATCH, 0/4] Handle GOMP_OPENACC_NVPTX_{DISASM,SAVE_TEMPS,JIT} in libgomp nvptx plugin

Tom de Vries Tom_deVries@mentor.com
Mon Jun 26 11:24:00 GMT 2017


Hi,

I've written a patch series to facilitate debugging libgomp OpenACC
testcase failures on the nvptx accelerator.


When running an OpenACC testcase on an nvptx accelerator, the following
happens:
- the plugin obtains the ptx assembly for the acceleration kernels
- it calls the CUDA JIT to compile and link the ptx into a module
- it loads the module
- it starts an acceleration kernel

The patch series adds these environment variables:
- GOMP_OPENACC_NVPTX_SAVE_TEMPS: a means to save the resulting module
   such that it can be investigated using nvdisasm and cuobjdump.
- GOMP_OPENACC_NVPTX_DISASM: a means to see the resulting module in
   the debug output, by writing it into a file and calling nvdisasm on
   it.
- GOMP_OPENACC_NVPTX_JIT: a means to set parameters of the
   compilation/linking process, currently supporting:
   * -O[0-4], mapping onto CU_JIT_OPTIMIZATION_LEVEL
   * -ori, mapping onto CU_JIT_NEW_SM3X_OPT
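
As an illustration of the GOMP_OPENACC_NVPTX_JIT handling described above,
here is a minimal sketch of how such a value could be parsed. It is not the
code from the patches: the struct, the function names, and the choice to
accept several space-separated flags are all assumptions made for the
example. Only the flag spellings (-O[0-4], -ori) and the variable name come
from the description above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parsed settings (not from the patch).  */
struct jit_settings
{
  int opt_level;  /* 0..4, or -1 if no -O flag was given.  */
  int ori;        /* 1 if -ori was given, 0 otherwise.  */
};

/* Parse a space-separated flag string such as "-O2 -ori".
   Returns 0 on success and -1 on the first unrecognized flag.
   Accepting multiple flags in one string is an assumption.  */
int
parse_jit_flags (const char *s, struct jit_settings *out)
{
  out->opt_level = -1;
  out->ori = 0;

  if (s == NULL)
    return 0;

  while (*s != '\0')
    {
      /* Skip separators between flags.  */
      while (*s == ' ')
        s++;
      if (*s == '\0')
        break;

      size_t len = strcspn (s, " ");

      if (len == 3 && s[0] == '-' && s[1] == 'O'
          && s[2] >= '0' && s[2] <= '4')
        out->opt_level = s[2] - '0';
      else if (len == 4 && strncmp (s, "-ori", 4) == 0)
        out->ori = 1;
      else
        return -1;

      s += len;
    }

  return 0;
}

/* Read the settings from the environment variable named in this series.  */
int
jit_settings_from_env (struct jit_settings *out)
{
  return parse_jit_flags (getenv ("GOMP_OPENACC_NVPTX_JIT"), out);
}
```

In a plugin, the resulting opt_level and ori fields would then be turned
into CU_JIT_OPTIMIZATION_LEVEL and CU_JIT_NEW_SM3X_OPT entries in the option
arrays passed to the CUDA JIT. That step is left out here so the sketch does
not depend on cuda.h.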


The patch series consists of these patches:

1. Show value of GOMP_OPENACC_DIM in libgomp nvptx plugin
2. Handle GOMP_OPENACC_NVPTX_{DISASM,SAVE_TEMPS} in libgomp nvptx plugin
3. Handle GOMP_OPENACC_NVPTX_JIT=-O[0-4] in libgomp nvptx plugin
4. Handle GOMP_OPENACC_NVPTX_JIT=-ori in libgomp nvptx plugin


I've tested the patch series on top of gomp-4_0-branch, by running an
OpenACC testcase from the command line and defining the various
environment variables.

[ A relevant difference between gomp-4_0-branch and master is that:
- master defines and includes ./libgomp/plugin/cuda/cuda.h, so I had to
   add the CU_JIT constants there, while
- gomp-4_0-branch doesn't define that local minimal cuda.h file but
   includes CUDA's cuda.h. My setup linked against CUDA 6.5, which defines
   CU_JIT_OPTIMIZATION_LEVEL but not yet CU_JIT_NEW_SM3X_OPT (that seems
   to have been introduced in CUDA 8.0), so I had to hardcode the latter.
]


OK for trunk if bootstrap and reg-test on x86_64 with nvidia accelerator
succeed?

Thanks,
- Tom


