[PATCH 1/3, libgomp] Adjust offload plugin interface for avoiding deadlock on exit

Chung-Lin Tang cltang@codesourcery.com
Wed Sep 9 08:12:00 GMT 2015


Ping.

On 2015/8/27 09:44 PM, Chung-Lin Tang wrote:
> We've discovered that, for several of the libgomp plugin interface routines,
> if the target-specific routine calls exit() (usually upon a fatal condition),
> deadlock ensues. We found this using nvptx, but it's possible on intelmic as well.
> 
> This is because many of the plugin routines are called with the device lock held,
> and when exit() is called inside the plugin code, the GOMP_unregister_var() destructor
> tries to iterate through and acquire all device locks to clean up. Since we already hold
> one of the device locks, this just gets stuck. Also, because gomp_mutex_t is a
> simple futex-based lock implementation (instead of pthreads), we don't have a
> trylock mechanism to use either.
> 
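
A toy model of the hang described above may help reviewers picture it. This is
only a sketch: the toy_* names are hypothetical and are not libgomp code, the
"lock" is a plain flag standing in for a lock with no trylock, and the routine
reports where the exit-time cleanup pass would block instead of actually
calling exit() and hanging.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NUM_DEVICES 3

/* Hypothetical stand-in for the per-device locks: 1 means "held".
   There is no owner tracking and no trylock, as with a simple futex lock.  */
static int toy_locks[TOY_NUM_DEVICES];

/* Models the cleanup pass that acquires every device lock in turn.
   Returns the index of the first lock that is already held (where a
   blocking acquire would never return), or -1 if the pass would finish.  */
static int
toy_cleanup_first_blocked (void)
{
  for (int i = 0; i < TOY_NUM_DEVICES; i++)
    if (toy_locks[i])
      return i;
  return -1;
}

/* Models a plugin routine that runs with device DEV locked and hits a
   fatal condition.  Instead of calling exit (), it reports where the
   exit-time cleanup would get stuck, then releases the lock.  */
static int
toy_plugin_fatal_under_lock (int dev)
{
  assert (dev >= 0 && dev < TOY_NUM_DEVICES);
  toy_locks[dev] = 1;
  int blocked = toy_cleanup_first_blocked ();
  toy_locks[dev] = 0;
  return blocked;
}
```

In this model, toy_plugin_fatal_under_lock (1) reports device 1 as the point
where cleanup would block, because that lock is still held by the caller.
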
> So this patch tries to alleviate this problem by changing the plugin interface;
> the plugin routines that are called while holding the device lock are adjusted
> so that they never exit fatally, but instead return a value to libgomp proper to
> indicate the execution result. The core libgomp code may then unlock and call gomp_fatal().
> 
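
The calling convention described above could be sketched roughly as follows.
This is an illustration under assumptions, not the patch itself: the toy_*
types and functions are hypothetical, the hook signature is only a guess at
the shape (bool result plus an out parameter for the pointer), and the caller
returns false where libgomp proper would call gomp_fatal() after unlocking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a device descriptor; not the real
   struct gomp_device_descr.  */
struct toy_device
{
  int lock_held;  /* Models the device lock: 1 while held.  */
  bool (*alloc_func) (int ord, void **out, size_t size);
};

static void
toy_lock (struct toy_device *dev)
{
  assert (!dev->lock_held);
  dev->lock_held = 1;
}

static void
toy_unlock (struct toy_device *dev)
{
  assert (dev->lock_held);
  dev->lock_held = 0;
}

/* Plugin-side hook: reports failure through its return value and never
   calls exit ().  The allocated pointer comes back through OUT.  */
static bool
toy_plugin_alloc (int ord, void **out, size_t size)
{
  (void) ord;
  *out = size ? malloc (size) : NULL;
  return *out != NULL;
}

/* Caller side: the hook runs with the lock held, and on failure the lock
   is released before the error is reported.  */
static bool
toy_map (struct toy_device *dev, size_t size, void **out)
{
  toy_lock (dev);
  if (!dev->alloc_func (0, out, size))
    {
      /* Unlock first; real code would then call gomp_fatal ().  */
      toy_unlock (dev);
      return false;
    }
  toy_unlock (dev);
  return true;
}
```

The point of the shape is that every error path leaves the lock released
before anything that can reach exit() runs.
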
> We believe this is the right route to solve the problem, since there are only
> two accel target plugins so far. Besides the nvptx plugin, I have made some effort
> to update the intelmic plugin as well, though it's not as thoroughly audited.
> Intel folks might want to further make sure that their plugin code is free of this problem as well.
> 
> This patch contains the libgomp proper changes. The nvptx and intelmic patches follow.
> I have tested the libgomp testsuite without regressions for both accel targets.
> Is this okay for trunk?
> 
> Thanks,
> Chung-Lin
> 
> 2015-08-27 Chung-Lin Tang <cltang@codesourcery.com>
> 
>         * oacc-host.c (host_init_device): Change return type to bool.
>         (host_fini_device): Likewise.
>         (host_dev2host): Likewise.
>         (host_host2dev): Likewise.
>         (host_free): Likewise.
>         (host_alloc): Change return type to bool, change to use out
>         parameter to return allocated pointer.
>         * oacc-mem.c (acc_malloc): Adjust for plugin hook declaration change,
>         handle fatal error.
>         (acc_free): Likewise.
>         (acc_memcpy_to_device): Likewise.
>         (acc_memcpy_from_device): Likewise.
>         * oacc-init.c (acc_init_1): Handle gomp_init_device return code,
>         handle fatal error.
>         (acc_set_device_type): Likewise.
>         (acc_set_device_num): Likewise.
>         * target.c (gomp_map_vars): Adjust alloc_func plugin hook call,
>         add device unlock, handle fatal error.
>         (gomp_unmap_tgt): Change return type to bool, adjust free_func
>         plugin call.
>         (gomp_copy_from_async): Handle dev2host_func return code, handle
>         fatal error.
>         (gomp_unmap_vars): Likewise.
>         (gomp_init_device): Change return type to bool, adjust call to
>         init_device_func plugin hook.
>         (GOMP_target): Adjust call to gomp_init_device, handle fatal error.
>         (GOMP_target_data): Likewise.
>         (GOMP_target_update): Likewise.
>         * libgomp.h (gomp_device_descr.init_device_func): Change return
>         type to bool.
>         (gomp_device_descr.fini_device_func): Likewise.
>         (gomp_device_descr.free_func): Likewise.
>         (gomp_device_descr.dev2host_func): Likewise.
>         (gomp_device_descr.host2dev_func): Likewise.
>         (gomp_device_descr.alloc_func): Change return type to bool,
>         use out parameter to return pointer.
>         (gomp_init_device): Change return type to bool.
> 


