[PATCH 0/8] NVPTX offloading to NVPTX: backend patches

Bernd Schmidt bschmidt@redhat.com
Wed Oct 19 10:39:00 GMT 2016

On 10/18/2016 06:58 PM, Alexander Monakov wrote:
> The currently published OpenMP version of LULESH simply doesn't use openmp-simd
> anywhere. This should make it obvious that it won't be anywhere near any
> reasonable CUDA implementation, and also bound to be below host performance.
> Besides, it's common for such benchmark suites to have very different levels of
> hand tuning for the native-CUDA implementation vs OpenMP implementation,
> sometimes to the point of significant algorithmic differences. So you're
> making an invalid comparison here.

The information I have is that the LULESH code is representative of how 
at least some groups on the HPC side expect to write OpenMP code. It's 
the biggest real-world piece of code that I'm aware of that's available 
for testing, so it seemed like a good thing to try. If you have other 
real-world tests available, please let us know. If you can demonstrate 
good performance by modifying LULESH sources, that would also be a good 
step, although maybe not the ideal case. But I think it's not 
unreasonable to look for a demonstration that reasonable performance is 
achievable on something that isn't just a microbenchmark.

I'll refrain from any further comments on the topic. The PTX patches
don't look unreasonable, provided someone else decides that this version
of OpenMP support should be merged, and I'll look into them in more
detail if that happens. Patch 2/8 is ok now.

