[Patch] libgomp/nvptx: Prepare for reverse-offload callback handling

Tobias Burnus tobias@codesourcery.com
Sun Oct 2 18:13:56 GMT 2022

On 27.09.22 11:23, Tobias Burnus wrote:

We do support
  #if __PTX_SM__ >= 600  (CUDA >= 8.0, ptx isa >= 5.0)
and we also can configure GCC with
  --with-arch=sm_70 (or sm_80 or ...)
Thus, adding atomics with .sys scope is possible.

See attached patch. This seems to work fine and I hope I got the
assembly right in terms of atomic use. (And I do believe that the
.release/.acquire do not need an additional __sync_synchronize()/"membar.sys".)

Regarding this:

While 'atom.op' (op = and/or/xor/cas/exch/add/inc/dec/min/max)
with scope is a sm_60 feature, the used 'st/ld' with scope qualifier
and .relaxed, .release / .relaxed, .acquire require sm_70.

(It does not really matter, as only ..., sm_53 and sm_70, ... are currently
supported but not sm_60; still, the #if should obviously be fixed.)
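To illustrate the corrected guard, here is a minimal sketch, not the code from the attached patch. The helper names (rev_publish, rev_poll) and the variable rev_fn_slot are hypothetical. The inline-assembler branch only reuses the two instructions quoted in this mail and is untested; it assumes the slot lives in global memory. On any other target the GCC __atomic builtins serve as a stand-in, so the sketch also compiles on the host.

```c
#include <stdint.h>

/* Hypothetical stand-in for the 64-bit 'fn' slot shared by device and host.  */
static uint64_t rev_fn_slot;

/* Publish FN_ADDR so that all earlier writes are visible before it.  */
static inline void
rev_publish (uint64_t *slot, uint64_t fn_addr)
{
#if defined (__PTX_SM__) && __PTX_SM__ >= 700
  /* sm_70 and newer: 'st' with .release and .sys scope (untested sketch).  */
  asm volatile ("st.global.release.sys.u64 [%0],%1;"
                : : "r" (slot), "r" (fn_addr) : "memory");
#else
  /* Stand-in for host builds and older targets: GCC release store.  */
  __atomic_store_n (slot, fn_addr, __ATOMIC_RELEASE);
#endif
}

/* Read the slot so that later reads are ordered after it.  */
static inline uint64_t
rev_poll (uint64_t *slot)
{
#if defined (__PTX_SM__) && __PTX_SM__ >= 700
  uint64_t val;
  /* sm_70 and newer: 'ld' with .acquire and .sys scope (untested sketch).  */
  asm volatile ("ld.acquire.sys.global.u64 %0,[%1];"
                : "=r" (val) : "r" (slot) : "memory");
  return val;
#else
  /* Stand-in for host builds and older targets: GCC acquire load.  */
  return __atomic_load_n (slot, __ATOMIC_ACQUIRE);
#endif
}
```

The point of the sketch is only the condition: the scoped 'st'/'ld' forms are selected for __PTX_SM__ >= 700 rather than >= 600.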

 * * *

Looking at the code generated without inline assembler, we have instead of
  st.global.release.sys.u64 [%r27],%r39;
  ld.acquire.sys.global.u64 %r62,[%r27];
for older systems (__PTX_SM__ < 700) the code:
  @ %r69 membar.sys;
  @ %r69 atom.exch.b64 _,[%r27],%r41;
  ld.global.u64 %r64,[__gomp_rev_offload_var];
  ld.u64 %r36,[%r64];

In my understanding, the membar.sys ensures - similar to
  st.release / ld.acquire
that the memory accesses are done in the correct order in scope .sys.
The 'fn' variable is initially 0 and is then only set via the device,
i.e. there is eventually a DMA write device->host, which is atomic
as the whole int64_t is written at once (and not, e.g., first the lower
and then the upper half). Hence, the 'st'/'atom.exch' should work fine,
despite having no .sys scope.

Likewise, the membar.sys also applies in the other direction. Or did I
miss something? If so, would an explicit __sync_synchronize() (= membar.sys)
help between the 'st' and the 'ld'?
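For concreteness, the pre-sm_70 variant asked about could look like the following sketch, written with GCC builtins only. All names (fn_slot, request_post, request_pending, request_take) are hypothetical, and the host-side reset to 0 is an assumption about the protocol, not something stated above. Whether the second fence is actually required is exactly the open question; the sketch only shows where it would go.

```c
#include <stdint.h>

/* Hypothetical 64-bit slot; 0 means "no request pending".  */
static uint64_t fn_slot;

/* Device side: full fence, then one 64-bit exchange without scope.  */
static inline void
request_post (uint64_t *slot, uint64_t fn_addr)
{
  __sync_synchronize ();  /* full fence; membar.sys on nvptx per this mail */
  (void) __atomic_exchange_n (slot, fn_addr, __ATOMIC_RELAXED);
  __sync_synchronize ();  /* the explicit fence between 'st' and 'ld' */
}

/* Device side: plain 64-bit load used while waiting for completion.  */
static inline int
request_pending (uint64_t *slot)
{
  return __atomic_load_n (slot, __ATOMIC_RELAXED) != 0;
}

/* Host side (assumed): fetch the request and reset the slot to 0.  */
static inline uint64_t
request_take (uint64_t *slot)
{
  return __atomic_exchange_n (slot, 0, __ATOMIC_ACQ_REL);
}
```

Because the slot is written as a single 64-bit value, a reader sees either 0 or the complete function address, never a half-written one.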


