[gomp4 00/14] NVPTX: further porting

Wed Oct 21 09:17:00 GMT 2015

On Wed, 21 Oct 2015, Jakub Jelinek wrote:
> > time (libcudadevrt.a), and imposes overhead at run time.  The last point might
> 
> But if this is the case, that is a really serious issue.  Is that really
> something that isn't available in a shared library?
> E.g. with my distro GCC maintainer hat on, I'd really like to tweak the
> libgomp PTX plugin, so that it compiles against a stub cuda.h header and
> doesn't link against libcuda*.so at all, but instead dlopens it, to avoid
> hard dependencies on the non-free CUDA stuff and more importantly any link
> time dependencies on that.  If libcudadevrt is not
> available as a shared library, this of course wouldn't work.  Would be nice to
> talk to NVidia about this...

libcudadevrt.a is a library of device (PTX) code, not host code, so dynamic
linking does not apply to it.
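
The dlopen approach remains possible for the host-side driver library.  A
minimal sketch of how a plugin might resolve 'cuInit' at run time instead of
linking against libcuda follows.  The 'cuda_driver' struct, the
'cuda_driver_load' function and the 'cu_init_fn' typedef are hypothetical names
used only for illustration, and the typedef stands in for what a stub cuda.h
would declare; none of this is the actual libgomp plugin code.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical stand-in for a declaration from a stub cuda.h.  CUresult is
   an enum in the real header; plain int is an assumption made here.  */
typedef int (*cu_init_fn) (unsigned int flags);

struct cuda_driver
{
  void *handle;        /* dlopen handle, NULL if the library is absent */
  cu_init_fn cu_init;  /* resolved address of cuInit, NULL if unresolved */
};

/* Try to load the driver library named SONAME at run time.
   Returns 0 on success, -1 if the library or the symbol is missing.  */
int
cuda_driver_load (const char *soname, struct cuda_driver *drv)
{
  drv->handle = dlopen (soname, RTLD_LAZY);
  drv->cu_init = NULL;
  if (drv->handle == NULL)
    return -1;

  /* Common idiom for storing a dlsym result in a function pointer.  */
  *(void **) &drv->cu_init = dlsym (drv->handle, "cuInit");
  if (drv->cu_init == NULL)
    {
      dlclose (drv->handle);
      drv->handle = NULL;
      return -1;
    }
  return 0;
}
```

A caller would pass "libcuda.so.1" as SONAME and treat a -1 result as "no
offloading available".  Older glibc versions need -ldl at link time.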

> > libgomp.c/thread-limit-2.c: fails to link because 'usleep' is unavailable on
> > NVPTX.  Note, the test does not run anything on the device because the target
> > region has an 'if (0)' clause.
> 
> As an optimization, perhaps we could avoid adding the "omp target entrypoint"
> attribute for the body of if(0) target region, that one always goes to host
> fallback, so no offloaded code is needed.
> 
> As for other tests, XFAILing them unconditionally is undesirable; presumably we
> could add a DejaGnu target check for whether the default target goes to PTX (if
> we don't have it already) and use that to xfail?

Yes, that's what I meant; such a check is already implemented for OpenACC.

> Of course that doesn't help the thread-limit-2.c testcase.

Why not?
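
For reference, the shape of the problem can be sketched as below.  This is a
reduced illustration, not the actual thread-limit-2.c; the function name is
hypothetical.  The 'if (0)' clause forces host fallback at run time, but the
region body is still compiled for the offload target, so the reference to
'usleep' has to be resolved there as well.

```c
#define _DEFAULT_SOURCE   /* make usleep visible under strict -std modes */
#include <unistd.h>

/* Run a short sleep inside a target region that is disabled by 'if (0)'.
   Returns 1 once the region body has executed (always on the host).  */
int
sleep_in_disabled_target_region (unsigned int usec)
{
  int ran = 0;

  /* With 'if (0)' the region never executes on the device, yet an
     offloading compiler still emits a device copy of the body.  */
  #pragma omp target if (0) map(tofrom: ran)
  {
    usleep (usec);
    ran = 1;
  }
  return ran;
}
```

Without offloading configured this builds and returns 1.  With an NVPTX
offload compiler, the device copy of the body is what fails to link.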

Alexander