[Patch][v5] libgomp/nvptx: Prepare for reverse-offload callback handling

Tobias Burnus tobias@codesourcery.com
Wed Oct 12 08:55:26 GMT 2022


On 11.10.22 13:12, Alexander Monakov wrote:
> My understanding is such trickery should not be necessary with
> the barrier-based approach, i.e. the sequence of PTX instructions
>
>    st   % plain store
>    membar.sys
>    st.volatile
>
> should be enough to guarantee that the former store is visible on the host
> before the latter, and work all the way back to sm_20.

If I understand it correctly, you mean:

   GOMP_REV_OFFLOAD_VAR->dev_num = GOMP_ADDITIONAL_ICVS.device_num;

   __sync_synchronize ();  /* membar.sys */
   asm volatile ("st.volatile.global.u64 [%0], %1;"
                 : : "r"(addr_struct_fn), "r" (fn) : "memory");


And then directly followed by the busy wait:

   while (__atomic_load_n (&GOMP_REV_OFFLOAD_VAR->fn, __ATOMIC_ACQUIRE) != 0)
     ;  /* spin  */

which GCC expands to:

   /* ld.global.u64 %r64,[__gomp_rev_offload_var];
      ld.u64 %r36,[%r64];
      membar.sys;  */
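
For anyone who wants to try the ordering without a GPU, the handshake above can be sketched on the host alone. This is only an analogy under stated assumptions, not the libgomp code: a pthread stands in for the host-side poller, a relaxed atomic store stands in for the st.volatile, and the struct and function names (rev_offload_sketch, host_side, device_side, rev_offload_roundtrip) are made up for illustration.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared reverse-offload variable.  */
struct rev_offload_sketch
{
  uint64_t fn;      /* Non-zero: request pending; zero: idle or done.  */
  int dev_num;      /* Payload that must be visible before 'fn'.  */
  int seen_dev_num; /* What the "host" observed; only for checking.  */
};

/* "Host" side: wait for a request, read the payload, acknowledge.  */
static void *
host_side (void *arg)
{
  struct rev_offload_sketch *v = arg;
  while (__atomic_load_n (&v->fn, __ATOMIC_ACQUIRE) == 0)
    ;  /* spin until the "device" publishes a function address  */
  v->seen_dev_num = v->dev_num;
  __atomic_store_n (&v->fn, 0, __ATOMIC_RELEASE);  /* signal completion  */
  return NULL;
}

/* "Device" side: plain store, full barrier, flag store, busy wait.  */
static void
device_side (struct rev_offload_sketch *v, uint64_t fn, int dev_num)
{
  v->dev_num = dev_num;                           /* plain store  */
  __sync_synchronize ();                          /* membar.sys on nvptx  */
  __atomic_store_n (&v->fn, fn, __ATOMIC_RELAXED); /* stands in for st.volatile  */
  while (__atomic_load_n (&v->fn, __ATOMIC_ACQUIRE) != 0)
    ;  /* spin until the "host" resets fn to zero  */
}

/* Run one round trip; return the dev_num the "host" saw, or -1 on error.
   fn == 0 is rejected because zero means "no request" in this protocol.  */
int
rev_offload_roundtrip (uint64_t fn, int dev_num)
{
  struct rev_offload_sketch v = { 0, 0, -1 };
  pthread_t host;
  if (fn == 0)
    return -1;
  if (pthread_create (&host, NULL, host_side, &v) != 0)
    return -1;
  device_side (&v, fn, dev_num);
  pthread_join (host, NULL);
  return v.seen_dev_num;
}
```

Calling rev_offload_roundtrip (0x1234, 3) should give back 3 if the payload store was visible before the flag store; older glibc versions may need -pthread when building this.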

The accordingly updated patch is attached.

(This and the removal of the mkoffload.cc part are the only larger
changes. Otherwise, the patch only addresses the minor comments by Jakub.
The now-removed CU_DEVICE_ATTRIBUTE_ASYNC_ENGINE_COUNT was used
until commit r10-304-g1f4c5b9bb2eb81880e2bc725435d596fcd2bdfef, i.e.
it is a really old leftover!)

Otherwise, tested* to work with sm_30 (error by mkoffload, unchanged),
sm_35 and sm_70.

Tobias

*With some added code; until GOMP_OFFLOAD_get_num_devices accepts
GOMP_REQUIRES_UNIFIED_SHARED_MEMORY and GOMP_OFFLOAD_load_image
gets passed a non-NULL value for rev_fn_table, the current patch is a no-op.

Planned next are the related GCN patch and the actual change
in libgomp/target.c (plus accepting USM in GOMP_OFFLOAD_get_num_devices).
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rev-offload-run-nvptx-v6.diff
Type: text/x-patch
Size: 19308 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc-patches/attachments/20221012/0e8a0793/attachment-0001.bin>

