This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 3/4, libgomp] Resolve deadlock on plugin exit, HSA plugin parts


Hi,

On Sun, Mar 27, 2016 at 06:26:29PM +0800, Chung-Lin Tang wrote:
> On 2016/3/25 äå 02:40, Martin Jambor wrote:
> > On the whole, I am fine with the patch but there are two issues:
> > 
> > First, and generally, when you change the return type of a function,
> > you must document what return values mean in the comment of the
> > function.  Most importantly, it must be immediately apparent whether a
> > function returns true or false on failure from its comment.  So please
> > fix that.
> 
> Thanks, I'll update on that.
> 
> >> >  /* Callback of dispatch queues to report errors.  */
> >> > @@ -454,7 +471,7 @@ queue_callback (hsa_status_t status,
> >> >  		hsa_queue_t *queue __attribute__ ((unused)),
> >> >  		void *data __attribute__ ((unused)))
> >> >  {
> >> > -  hsa_fatal ("Asynchronous queue error", status);
> >> > +  hsa_error ("Asynchronous queue error", status);
> >> >  }
> > ...I believe this hunk is wrong.  Errors reported in this way mean
> > that something is very wrong and generally happen during execution of
> > code on HSA GPU, i.e. within GOMP_OFFLOAD_run.  And since you left
> > calls in create_single_kernel_dispatch, which is called as a part of
> > GOMP_OFFLOAD_run, intact, I believe you actually want to leave
> > hsa_fatel here too.
> 
> Yes, a fatal exit is okay within the 'run' hook, since we're not holding
> the device lock there. I was only trying to audit the GOMP_OFFLOAD_init_device()
> function, where the queues are created.
> 
> I'm not familiar with the HSA runtime API; will the callback only be triggered
> during GPU kernel execution (inside the 'run' hook), and not for example,
> within hsa_queue_create()? If so, then yes as you advised, the above change to
> queue_callback() should be reverted.
> 

The documentation says the callback is "invoked by the HSA runtime for
every asynchronous event related to the newly created queue."  All
enumerated situations when the callback is called happen at command
launch time (i.e. inside a run hook).

Since creation of the queue is a synchronous event, callback should
not be invoked if it fails.  But of course, the description does not
rule out such failures do not occur out of the blue at any arbitrary
time.  But I think this is as improbable as an GOMP_PLUGIN_malloc
ending up in a fatal error, which is something you do not seem to be
worried about.

So please revert the hunk.

Thanks,

Martin


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]