[Bug target/96932] [nvptx] atomic_exchange missing barrier

vries at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Wed May 12 12:07:28 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96932

--- Comment #4 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tobias Burnus from comment #3)
> Crossref: PR100497 - fails on Volta without
>   membar.sys;
> before
>   atom.global.exch.b32
> 
> Unfortunately, compared to pre-Volta, it is very slow - membar.gl is still
> slow but a bit less.  Using (→ sm_70) fence.sys / fence.gnu instead of
> fence.sc.{sys,gnu} (= membar.{sys,gl} on >= sm_70) does not seem to make a

fence.sc.gpu, funny typo :)

> performance difference for PR100497.

GOMP_atomic_start/GOMP_atomic_end are fallbacks, and unfortunately cannot
be expected to be optimal.
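
To illustrate why such a fallback is slow, here is a rough sketch of a
lock-based scheme, assuming a single global spin lock. The sketch_* names are
hypothetical and this is not the libgomp source; it only shows that every
atomic region ends up serialized on one lock.

```c
#include <stdint.h>

/* Hypothetical stand-ins for GOMP_atomic_start/GOMP_atomic_end, assuming
   one global spin lock shared by all atomic regions.  */
static int sketch_atomic_lock;

static void
sketch_atomic_start (void)
{
  /* Spin until the exchange returns 0, i.e. the lock was free.  */
  while (__atomic_exchange_n (&sketch_atomic_lock, 1, __ATOMIC_ACQUIRE))
    ;
}

static void
sketch_atomic_end (void)
{
  /* Release the lock so that the next waiter can proceed.  */
  __atomic_store_n (&sketch_atomic_lock, 0, __ATOMIC_RELEASE);
}

/* A 16-bit exchange done without a native 16-bit atomic instruction:
   the plain load and store are protected only by the global lock.  */
static uint16_t
sketch_exchange_u16 (uint16_t *p, uint16_t val)
{
  sketch_atomic_start ();
  uint16_t old = *p;
  *p = val;
  sketch_atomic_end ();
  return old;
}
```

Unrelated atomic updates contend for the same lock here, which is the cost
one pays for a generic fallback.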

Following the introduction of -mptx=6.3 we can add support for atom.cas.b16
(well, once we also introduce -misa=sm_70), and that should be the optimal
solution.
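
As a sketch of what a 16-bit compare-and-swap buys us: with one available, a
16-bit atomic update can be written as a CAS retry loop instead of taking a
global lock. The function name below is hypothetical, and that the builtin
would map to atom.cas.b16 on nvptx is an assumption; the code uses only the
generic GCC __atomic builtins.

```c
#include <stdint.h>

/* Hypothetical helper: atomic fetch-and-add on a 16-bit object, built
   from a 16-bit compare-and-swap.  Returns the value before the add.  */
static uint16_t
sketch_cas_fetch_add_u16 (uint16_t *p, uint16_t inc)
{
  uint16_t old = __atomic_load_n (p, __ATOMIC_RELAXED);

  /* On failure the builtin writes the current value of *p into 'old',
     so the next iteration retries with fresh data.  */
  while (!__atomic_compare_exchange_n (p, &old, (uint16_t) (old + inc),
                                       0 /* strong */,
                                       __ATOMIC_SEQ_CST, __ATOMIC_RELAXED))
    ;

  return old;
}
```

Only threads touching the same location interfere with each other here, in
contrast to the lock-based fallback.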

