[Bug target/96932] [nvptx] atomic_exchange missing barrier
vries at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed May 12 12:07:28 GMT 2021
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96932
--- Comment #4 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tobias Burnus from comment #3)
> Crossref: PR100497 - fails on Volta without
> membar.sys;
> before
> atom.global.exch.b32
>
> Unfortunately, compared to pre-Volta, it is very slow - membar.gl is still
> slow but a bit less. Using (→ sm_70) fence.sys / fence.gnu instead of
> fence.sc.{sys,gnu} (= membar.{sys,gl} on >= sm_70) does not seem to make a

fence.sc.gpu, funny typo :)

> performance difference for PR100497.

The GOMP_atomic_start/GOMP_atomic_end functions are fallbacks, and unfortunately
cannot be expected to be optimal.

Following the introduction of -mptx=6.3, we can add support for atom.cas.b16
(well, once we also introduce -misa=sm_70), and that should be the optimal
solution.
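
To illustrate the difference between the two strategies, here is a minimal host-side sketch in C11, not nvptx code and not what GCC actually emits. It contrasts a lock-bracketed 16-bit exchange (roughly the shape of a GOMP_atomic_start/GOMP_atomic_end fallback) with one built from a native 16-bit compare-and-swap (the role atom.cas.b16 would play). The names fallback_lock, exchange_u16_via_lock and exchange_u16_via_cas are hypothetical, and the spinlock is only an assumed stand-in for the libgomp routines.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the single global lock that a
   GOMP_atomic_start/GOMP_atomic_end style fallback serializes on. */
static atomic_flag fallback_lock = ATOMIC_FLAG_INIT;

/* Fallback shape: every "atomic" operation takes the same global lock,
   so unrelated atomics contend with each other. */
static uint16_t exchange_u16_via_lock(uint16_t *p, uint16_t desired)
{
    while (atomic_flag_test_and_set_explicit(&fallback_lock,
                                             memory_order_acquire)) {
        /* spin until the lock is free */
    }
    uint16_t old = *p;
    *p = desired;
    atomic_flag_clear_explicit(&fallback_lock, memory_order_release);
    return old;
}

/* Native shape: a retry loop around a 16-bit compare-and-swap,
   with no shared lock involved. */
static uint16_t exchange_u16_via_cas(_Atomic uint16_t *p, uint16_t desired)
{
    uint16_t expected = atomic_load_explicit(p, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(p, &expected, desired,
                                                  memory_order_seq_cst,
                                                  memory_order_relaxed)) {
        /* on failure, expected now holds the current value; retry */
    }
    return expected;
}
```

Both helpers return the previous value; only the first one serializes all callers on one lock, which is why the fallback cannot be expected to perform well.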