[PATCH 1/6, OpenACC, libgomp] Async re-work, interfaces

Chung-Lin Tang chunglin_tang@mentor.com
Mon Dec 17 10:27:00 GMT 2018


On 2018/12/15 1:52 AM, Thomas Schwinge wrote:
>> --- a/include/gomp-constants.h
>> +++ b/include/gomp-constants.h
> 
>> @@ -199,7 +200,7 @@ enum gomp_map_kind
>>   /* Versions of libgomp and device-specific plugins.  GOMP_VERSION
>>      should be incremented whenever an ABI-incompatible change is introduced
>>      to the plugin interface defined in libgomp/libgomp.h.  */
>> -#define GOMP_VERSION	1
>> +#define GOMP_VERSION	2
>>   #define GOMP_VERSION_NVIDIA_PTX 1
>>   #define GOMP_VERSION_INTEL_MIC 0
>>   #define GOMP_VERSION_HSA 0
> 
> OK, I think -- but I'm never quite sure whether we do need to increment
> "GOMP_VERSION" when only doing libgomp-internal libgomp-plugin changes,
> which don't affect the user/GCC side?
> 
> GCC encodes "GOMP_VERSION" in "GOMP_offload_register_ver" calls
> synthesized by "mkoffload": "GOMP_VERSION_PACK (/* LIB */ GOMP_VERSION,
> /* DEV */ GOMP_VERSION_NVIDIA_PTX)", and then at run time libgomp checks
> in "GOMP_offload_register_ver", so that we don't try to load offloading
> code with an _old_ libgomp that has been compiled with/for the _new_
> version.  (Right?)
> 
>      void
>      GOMP_offload_register_ver (unsigned version, const void *host_table,
>                                 int target_type, const void *target_data)
>      { [...]
>        if (GOMP_VERSION_LIB (version) > GOMP_VERSION)
>          gomp_fatal ("Library too old for offload (version %u < %u)",
>                      GOMP_VERSION, GOMP_VERSION_LIB (version));
> 
> I don't have a problem with your change per se, but wouldn't we still be
> able to load such code, given that we only changed the libgomp-interal
> libgomp-plugin interface?
> 
> Am I confused?
> 
> Or is the above just an (unavoidable?) side effect, because we do need to
> increment "GOMP_VERSION" for this check here:
> 
>        if (device->version_func () != GOMP_VERSION)
>          {
>            err = "plugin version mismatch";
>            goto fail;
>          }
> 
> ..., which is making sure that the libgomp proper vs. libgomp-plugin
> versions match.

The intended effect is exactly to ensure libgomp proper vs plugin compatibility.
We don't ensure backward/forward compatibility between libgomp/plugin, and
this version equality check is what enforces that.

>> --- a/libgomp/libgomp.map
>> +++ b/libgomp/libgomp.map
>> @@ -458,7 +462,6 @@ GOMP_PLUGIN_1.0 {
>>   	GOMP_PLUGIN_debug;
>>   	GOMP_PLUGIN_error;
>>   	GOMP_PLUGIN_fatal;
>> -	GOMP_PLUGIN_async_unmap_vars;
>>   	GOMP_PLUGIN_acc_thread;
>>   };
> 
> I think that's fine, but highlighting this again for Jakub, in case
> there's an issue with removing a symbol from the libgomp-plugin
> interface.

Since we don't enforce the libgomp/plugin interface to be compatible across
versions, I expect this to be okay.

>> --- a/libgomp/libgomp-plugin.h
>> +++ b/libgomp/libgomp-plugin.h
> 
>> +/* Opaque type to represent plugin-dependent implementation of an
>> +   OpenACC asynchronous queue.  */
>> +struct goacc_asyncqueue;
>> +
>> +/* Used to keep a list of active asynchronous queues.  */
>> +struct goacc_asyncqueue_list
>> +{
>> +  struct goacc_asyncqueue *aq;
>> +  struct goacc_asyncqueue_list *next;
>> +};
>> +
>> +typedef struct goacc_asyncqueue *goacc_aq;
>> +typedef struct goacc_asyncqueue_list *goacc_aq_list;
> 
> I'm not too fond of such "syntactic sugar" typedefs, but if that's fine
> for Jakub to have in libgomp, then I won't object.
> 
> I'd be in favor then of "typedef struct N *N" or "typedef struct N *N_t"
> variants however, instead of introducing yet another "goacc_aq" acronym
> next to "goacc_asyncqueue", and "async queue" or "asynchronous queue" as
> used in the descriptive texts (comments, etc.).  Maybe standardize all
> these to "asyncqueue", also in the descriptive texts?
> 
> OpenACC, by the way, uses the term "device activity queue" (in most?
> places...) to describe the underlying mechanism used to implement the
> OpenACC "async" clause etc.

Please, no more name style changes, please... Orz (beg)

I think I originally thought of "asyncqueue" too, but I felt a shorthand was
needed in many places, and the straightforward "aq" seemed too short to be
comfortably informative. The "goacc_" prefix seemed just right.

Besides, the crucial name convention I had in mind was "queues" in OpenACC,
versus "streams" in CUDA. I don't see much value in further spinning on this
name.

> Should "struct goacc_asyncqueue_list" and its typedef still be defined
> here in "libgomp/libgomp-plugin.h" (for proximity to the other stuff),
> even though it's not actually used in the libgomp-plugin interface?

It looks like it can indeed be placed in libgomp.h, maybe just before the
declaration of acc_dispatch_t.

I originally placed it there in libgomp-plugin.h simply to
collect the declarations related to goacc_asyncqueue together in one place.
Is separating them really better?

>> --- a/libgomp/libgomp.h
>> +++ b/libgomp/libgomp.h
>> @@ -888,19 +888,23 @@ typedef struct acc_dispatch_t
> [...]
>> +  struct {
>> +    gomp_mutex_t lock;
>> +    int nasyncqueue;
>> +    struct goacc_asyncqueue **asyncqueue;
>> +    struct goacc_asyncqueue_list *active;
> [...]
>> +  } async;
> 
> For "lock" see my comments elsewhere.

I'll respond to that part there later as well.

> That data structure itself should be fine, no need for something more
> complex, given that users typically only use a handful of such queues,
> with low integer ID async-arguments.
> I'd maybe name these members "queues_n", "queues", "queues_active".
> 
> 
> As for the following changes, will you please make sure that there is one
> common order for these, used in "libgomp/libgomp-plugin.h" function
> prototypes, "libgomp/libgomp.h:acc_dispatch_t",
> "libgomp/target.c:gomp_load_plugin_for_device", "libgomp/oacc-host.c"
> function definitions as well as in "host_dispatch", and the
> libgomp-plugin(s) themselves (that's all, I think?).

Okay, I'll update that.

Thanks,
Chung-Lin



>> --- a/libgomp/libgomp-plugin.h
>> +++ b/libgomp/libgomp-plugin.h
>> @@ -93,22 +107,31 @@ extern bool GOMP_OFFLOAD_dev2dev (int, void *, const void *, size_t);
>>   extern bool GOMP_OFFLOAD_can_run (void *);
>>   extern void GOMP_OFFLOAD_run (int, void *, void *, void **);
>>   extern void GOMP_OFFLOAD_async_run (int, void *, void *, void **, void *);
>> +
>>   extern void GOMP_OFFLOAD_openacc_exec (void (*) (void *), size_t, void **,
>> -				       void **, int, unsigned *, void *);
>> -extern void GOMP_OFFLOAD_openacc_register_async_cleanup (void *, int);
>> -extern int GOMP_OFFLOAD_openacc_async_test (int);
>> -extern int GOMP_OFFLOAD_openacc_async_test_all (void);
>> -extern void GOMP_OFFLOAD_openacc_async_wait (int);
>> -extern void GOMP_OFFLOAD_openacc_async_wait_async (int, int);
>> -extern void GOMP_OFFLOAD_openacc_async_wait_all (void);
>> -extern void GOMP_OFFLOAD_openacc_async_wait_all_async (int);
>> -extern void GOMP_OFFLOAD_openacc_async_set_async (int);
>> +				       void **, unsigned *, void *);
>> +extern void GOMP_OFFLOAD_openacc_async_exec (void (*) (void *), size_t, void **,
>> +					     void **, unsigned *, void *,
>> +					     struct goacc_asyncqueue *);
>> +extern struct goacc_asyncqueue *GOMP_OFFLOAD_openacc_async_construct (void);
>> +extern bool GOMP_OFFLOAD_openacc_async_destruct (struct goacc_asyncqueue *);
>> +extern int GOMP_OFFLOAD_openacc_async_test (struct goacc_asyncqueue *);
>> +extern void GOMP_OFFLOAD_openacc_async_synchronize (struct goacc_asyncqueue *);
>> +extern void GOMP_OFFLOAD_openacc_async_serialize (struct goacc_asyncqueue *,
>> +						  struct goacc_asyncqueue *);
>> +extern void GOMP_OFFLOAD_openacc_async_queue_callback (struct goacc_asyncqueue *,
>> +						       void (*)(void *), void *);
>> +extern bool GOMP_OFFLOAD_openacc_async_host2dev (int, void *, const void *, size_t,
>> +						 struct goacc_asyncqueue *);
>> +extern bool GOMP_OFFLOAD_openacc_async_dev2host (int, void *, const void *, size_t,
>> +						 struct goacc_asyncqueue *);
>>   extern void *GOMP_OFFLOAD_openacc_create_thread_data (int);
>>   extern void GOMP_OFFLOAD_openacc_destroy_thread_data (void *);
>>   extern void *GOMP_OFFLOAD_openacc_cuda_get_current_device (void);
>>   extern void *GOMP_OFFLOAD_openacc_cuda_get_current_context (void);
>> -extern void *GOMP_OFFLOAD_openacc_cuda_get_stream (int);
>> -extern int GOMP_OFFLOAD_openacc_cuda_set_stream (int, void *);
>> +extern void *GOMP_OFFLOAD_openacc_cuda_get_stream (struct goacc_asyncqueue *);
>> +extern int GOMP_OFFLOAD_openacc_cuda_set_stream (struct goacc_asyncqueue *,
>> +						 void *);
>>   
>>   #ifdef __cplusplus
>>   }
> 
>> --- a/libgomp/libgomp.h
>> +++ b/libgomp/libgomp.h
>> @@ -888,19 +888,23 @@ typedef struct acc_dispatch_t
>>     /* Execute.  */
>>     __typeof (GOMP_OFFLOAD_openacc_exec) *exec_func;
>>   
>> -  /* Async cleanup callback registration.  */
>> -  __typeof (GOMP_OFFLOAD_openacc_register_async_cleanup)
>> -    *register_async_cleanup_func;
>> -
>> -  /* Asynchronous routines.  */
>> -  __typeof (GOMP_OFFLOAD_openacc_async_test) *async_test_func;
>> -  __typeof (GOMP_OFFLOAD_openacc_async_test_all) *async_test_all_func;
>> -  __typeof (GOMP_OFFLOAD_openacc_async_wait) *async_wait_func;
>> -  __typeof (GOMP_OFFLOAD_openacc_async_wait_async) *async_wait_async_func;
>> -  __typeof (GOMP_OFFLOAD_openacc_async_wait_all) *async_wait_all_func;
>> -  __typeof (GOMP_OFFLOAD_openacc_async_wait_all_async)
>> -    *async_wait_all_async_func;
>> -  __typeof (GOMP_OFFLOAD_openacc_async_set_async) *async_set_async_func;
>> +  struct {
>> +    gomp_mutex_t lock;
>> +    int nasyncqueue;
>> +    struct goacc_asyncqueue **asyncqueue;
>> +    struct goacc_asyncqueue_list *active;
>> +
>> +    __typeof (GOMP_OFFLOAD_openacc_async_construct) *construct_func;
>> +    __typeof (GOMP_OFFLOAD_openacc_async_destruct) *destruct_func;
>> +    __typeof (GOMP_OFFLOAD_openacc_async_test) *test_func;
>> +    __typeof (GOMP_OFFLOAD_openacc_async_synchronize) *synchronize_func;
>> +    __typeof (GOMP_OFFLOAD_openacc_async_serialize) *serialize_func;
>> +    __typeof (GOMP_OFFLOAD_openacc_async_queue_callback) *queue_callback_func;
>> +
>> +    __typeof (GOMP_OFFLOAD_openacc_async_exec) *exec_func;
>> +    __typeof (GOMP_OFFLOAD_openacc_async_host2dev) *host2dev_func;
>> +    __typeof (GOMP_OFFLOAD_openacc_async_dev2host) *dev2host_func;
>> +  } async;
>>   
>>     /* Create/destroy TLS data.  */
>>     __typeof (GOMP_OFFLOAD_openacc_create_thread_data) *create_thread_data_func;
> 
> 
> Grüße
>   Thomas
> 



More information about the Gcc-patches mailing list