This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 3/4, libgomp] Resolve deadlock on plugin exit, HSA plugin parts


Hi,

On Mon, Mar 21, 2016 at 06:22:17PM +0800, Chung-Lin Tang wrote:
> Hi Martin, I think you're the one to CC for this,
> as I mentioned in the first email, this has been build tested, however I did
> not know if I could test this without a Radeon card.  If convenient,
> could you or anyone familiar with the setup do a make check-target-libgomp
> with this patch series?
> 
> Thanks,
> Chung-Lin
> 
> 
>         * plugin/plugin-hsa.c (hsa_warn): Adjust 'hsa_error' local variable
>         to 'hsa_error_msg', for clarity.
>         (hsa_fatal): Likewise.
>         (hsa_error): New function.
>         (init_hsa_context): Change return type to bool, adjust to return
>         false on error.
>         (queue_callback): Adjust to call hsa_error.
>         (GOMP_OFFLOAD_get_num_devices): Adjust to handle init_hsa_context
>         return value.
>         (GOMP_OFFLOAD_init_device): Change return type to bool, adjust to
>         return false on error.
>         (get_agent_info): Adjust to return NULL on error.
>         (destroy_hsa_program): Change return type to bool, adjust to
>         return false on error.
>         (GOMP_OFFLOAD_load_image): Adjust to return -1 on error.
>         (destroy_module): Change return type to bool, adjust to
>         return false on error.
>         (GOMP_OFFLOAD_unload_image): Likewise.
>         (GOMP_OFFLOAD_fini_device): Likewise.
>         (GOMP_OFFLOAD_alloc): Change to return NULL when called.
>         (GOMP_OFFLOAD_free): Change to return false when called.
>         (GOMP_OFFLOAD_dev2host): Likewise.
>         (GOMP_OFFLOAD_host2dev): Likewise.
>         (GOMP_OFFLOAD_dev2dev): Likewise.

On the whole, I am fine with the patch but there are two issues:

First, and generally, when you change the return type of a function,
you must document what return values mean in the comment of the
function.  Most importantly, it must be immediately apparent whether a
function returns true or false on failure from its comment.  So please
fix that.

Second...

> Index: libgomp/plugin/plugin-hsa.c
> ===================================================================
> --- libgomp/plugin/plugin-hsa.c	(revision 234358)
> +++ libgomp/plugin/plugin-hsa.c	(working copy)
> @@ -175,10 +175,10 @@ hsa_warn (const char *str, hsa_status_t status)
>    if (!debug)
>      return;
>  
> -  const char *hsa_error;
> -  hsa_status_string (status, &hsa_error);
> +  const char *hsa_error_msg;
> +  hsa_status_string (status, &hsa_error_msg);
>  
> -  fprintf (stderr, "HSA warning: %s\nRuntime message: %s", str, hsa_error);
> +  fprintf (stderr, "HSA warning: %s\nRuntime message: %s", str, hsa_error_msg);
>  }
>  
>  /* Report a fatal error STR together with the HSA error corresponding to STATUS
> @@ -187,12 +187,25 @@ hsa_warn (const char *str, hsa_status_t status)
>  static void
>  hsa_fatal (const char *str, hsa_status_t status)
>  {
> -  const char *hsa_error;
> -  hsa_status_string (status, &hsa_error);
> +  const char *hsa_error_msg;
> +  hsa_status_string (status, &hsa_error_msg);
>    GOMP_PLUGIN_fatal ("HSA fatal error: %s\nRuntime message: %s", str,
> -		     hsa_error);
> +		     hsa_error_msg);
>  }
>  
> +/* Like hsa_fatal, except only report error message, and return FALSE
> +   for propagating error processing to outside of plugin.  */
> +
> +static bool
> +hsa_error (const char *str, hsa_status_t status)
> +{
> +  const char *hsa_error_msg;
> +  hsa_status_string (status, &hsa_error_msg);
> +  GOMP_PLUGIN_error ("HSA fatal error: %s\nRuntime message: %s", str,
> +		     hsa_error_msg);
> +  return false;
> +}
> +
>  struct hsa_kernel_description
>  {
>    const char *name;

...

>  /* Callback of dispatch queues to report errors.  */
> @@ -454,7 +471,7 @@ queue_callback (hsa_status_t status,
>  		hsa_queue_t *queue __attribute__ ((unused)),
>  		void *data __attribute__ ((unused)))
>  {
> -  hsa_fatal ("Asynchronous queue error", status);
> +  hsa_error ("Asynchronous queue error", status);
>  }

...I believe this hunk is wrong.  Errors reported in this way mean
that something is very wrong and generally happen during execution of
code on HSA GPU, i.e. within GOMP_OFFLOAD_run.  And since you left
calls in create_single_kernel_dispatch, which is called as a part of
GOMP_OFFLOAD_run, intact, I believe you actually want to leave
hsa_fatel here too.

Thanks,

Martin


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]