[Patch] libgomp/nvptx: Prepare for reverse-offload callback handling
Tobias Burnus
tobias@codesourcery.com
Sun Oct 2 18:13:56 GMT 2022
On 27.09.22 11:23, Tobias Burnus wrote:
We do support
#if __PTX_SM__ >= 600 (CUDA >= 8.0, ptx isa >= 5.0)
and we also can configure GCC with
--with-arch=sm_70 (or sm_80 or ...)
Thus, adding atomics with .sys scope is possible.
See attached patch. This seems to work fine and I hope I got the
assembly right in terms of atomic use. (And I do believe that the
.release/.acquire do not need an additional __sync_synchronize()/"membar.sys".)
Regarding this:
While 'atom.op' (op = and/or/xor/cas/exch/add/inc/dec/min/max)
with scope is a sm_60 feature, the used 'st/ld' with scope qualifier
and .relaxed, .release / .relaxed, .acquire require sm_70.
(This does not really matter, as only ..., sm_53 and sm_70, ... are currently
supported but not sm_60; still, the #if should obviously be fixed.)
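
A minimal sketch of how the corrected guard might look. The helper names
rev_store_release and rev_load_acquire are hypothetical and not taken from
the patch; the PTX strings are the ones quoted in this mail. The #else
branch uses GCC's __atomic builtins, so the same file also builds for
older targets (and for the host), where the compiler picks the sequence.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not the names used in the patch.  */

static inline void
rev_store_release (uint64_t *addr, uint64_t val)
{
#if defined (__PTX_SM__) && __PTX_SM__ >= 700
  /* st with scope and .release needs sm_70, hence 700 and not 600.
     Instruction text as quoted in this mail.  */
  asm volatile ("st.global.release.sys.u64 [%0], %1;"
		: : "r" (addr), "r" (val) : "memory");
#else
  /* Older PTX targets and non-PTX builds: leave it to the compiler.  */
  __atomic_store_n (addr, val, __ATOMIC_RELEASE);
#endif
}

static inline uint64_t
rev_load_acquire (uint64_t *addr)
{
#if defined (__PTX_SM__) && __PTX_SM__ >= 700
  uint64_t val;
  /* ld with scope and .acquire likewise needs sm_70.  */
  asm volatile ("ld.acquire.sys.global.u64 %0, [%1];"
		: "=r" (val) : "r" (addr) : "memory");
  return val;
#else
  return __atomic_load_n (addr, __ATOMIC_ACQUIRE);
#endif
}
```

With this shape, only the condition of the #if decides which variant is
used, so changing 600 to 700 is a one-line fix.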
* * *
Looking at the generated code without inline assembler, we have, instead of
st.global.release.sys.u64 [%r27],%r39;
and
ld.acquire.sys.global.u64 %r62,[%r27];
for older systems (__PTX_SM__ < 700), the code:
@ %r69 membar.sys;
@ %r69 atom.exch.b64 _,[%r27],%r41;
and
ld.global.u64 %r64,[__gomp_rev_offload_var];
ld.u64 %r36,[%r64];
membar.sys;
In my understanding, the membar.sys ensures - similar to
st.release / ld.acquire -
that memory accesses are ordered correctly in scope .sys.
As the 'fn' variable is initially 0 and is then only set by the device,
there is eventually a DMA write device->host, which is atomic
as the whole int64_t is written at once (and not, e.g., first the lower
and then the upper half). The 'st'/'atom.exch' should therefore work fine,
despite having no .sys scope.
Likewise, the membar.sys also applies in the other direction. Or did I
miss something? If so, would an explicit __sync_synchronize() (= membar.sys)
help between the 'st' and the 'ld'?
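
To make the question concrete, here is a rough sketch of the pre-sm_70
path with the extra barrier in place. rev_fn_slot, rev_publish_fallback
and rev_still_pending are hypothetical names, not the libgomp ones, and
the sketch does not settle whether the second barrier is actually needed.

```c
#include <stdint.h>

/* Hypothetical stand-in for the 'fn' slot; not the libgomp declaration.  */
static uint64_t rev_fn_slot;

/* Publish FN on a pre-sm_70 target, following the generated sequence
   quoted above: a barrier, then an atomic exchange.  */
static inline void
rev_publish_fallback (uint64_t fn)
{
  __sync_synchronize ();	/* full barrier (membar.sys per this mail) */
  (void) __atomic_exchange_n (&rev_fn_slot, fn, __ATOMIC_RELAXED);
  __sync_synchronize ();	/* the extra barrier the question is about */
}

/* One polling step: nonzero while the slot has not been reset to 0.  */
static inline int
rev_still_pending (void)
{
  uint64_t v = __atomic_load_n (&rev_fn_slot, __ATOMIC_RELAXED);
  __sync_synchronize ();	/* load followed by a barrier, as in the dump */
  return v != 0;
}
```

A caller would invoke rev_publish_fallback once and then spin on
rev_still_pending until the other side has cleared the slot.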
Tobias