This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH, og8] Add OpenACC 2.6 `acc_get_property' support
- From: Chung-Lin Tang <chunglin_tang at mentor dot com>
- To: "Maciej W. Rozycki" <macro at codesourcery dot com>, <gcc-patches at gcc dot gnu dot org>
- Cc: Thomas Schwinge <thomas at codesourcery dot com>, Chung-Lin Tang <cltang at codesourcery dot com>, Jakub Jelinek <jakub at redhat dot com>, Catherine Moore <clm at codesourcery dot com>, Tom de Vries <tdevries at suse dot de>
- Date: Wed, 5 Dec 2018 18:12:23 +0800
- Subject: Re: [PATCH, og8] Add OpenACC 2.6 `acc_get_property' support
- References: <alpine.DEB.2.21.9999.1812031551070.55818@build7-trusty-cs.sje.mentorg.com>
- Reply-to: <cltang at codesourcery dot com>
Hi Maciej, please see below:
On 2018/12/4 12:51 AM, Maciej W. Rozycki wrote:
+module openacc_c_string
+ implicit none
+
+ interface
+ function strlen (s) bind (C, name = "strlen")
+ use iso_c_binding, only: c_ptr, c_size_t
+ type (c_ptr), intent(in), value :: s
+ integer (c_size_t) :: strlen
+ end function
+ end interface
+
+end module
+subroutine acc_get_property_string_h (n, d, p, s)
+ use iso_c_binding, only: c_char, c_int, c_ptr, c_f_pointer
+ use openacc_internal, only: acc_get_property_string_l
+ use openacc_c_string, only: strlen
+ use openacc_kinds
...
+ pint = int (p, c_int)
+ cptr = acc_get_property_string_l (n, d, pint)
+ clen = int (strlen (cptr))
+ call c_f_pointer (cptr, sptr, [clen])
AFAIK, things like strlen are already available in iso_c_binding, in forms like "C_strlen".
Can you check again if that 'openacc_c_string' module is really necessary?
+union gomp_device_property_value
+GOMP_OFFLOAD_get_property (int n, int prop)
+{
+ union gomp_device_property_value propval = { .val = 0 };
+
+ pthread_mutex_lock (&ptx_dev_lock);
+
+ if (!nvptx_init () || n >= nvptx_get_num_devices ())
+ {
+ pthread_mutex_unlock (&ptx_dev_lock);
+ return propval;
+ }
+
+ switch (prop)
+ {
+ case GOMP_DEVICE_PROPERTY_MEMORY:
+ {
+ size_t total_mem;
+ CUdevice dev;
+
+ CUDA_CALL_ERET (propval, cuDeviceGet, &dev, n);
+ CUDA_CALL_ERET (propval, cuDeviceTotalMem, &total_mem, dev);
+ propval.val = total_mem;
+ }
+ break;
+ case GOMP_DEVICE_PROPERTY_FREE_MEMORY:
+ {
+ size_t total_mem;
+ size_t free_mem;
+ CUdevice ctxdev;
+ CUdevice dev;
+
+ CUDA_CALL_ERET (propval, cuCtxGetDevice, &ctxdev);
+ CUDA_CALL_ERET (propval, cuDeviceGet, &dev, n);
+ if (dev == ctxdev)
+ CUDA_CALL_ERET (propval, cuMemGetInfo, &free_mem, &total_mem);
+ else if (ptx_devices[n])
+ {
+ CUcontext old_ctx;
+
+ CUDA_CALL_ERET (propval, cuCtxPushCurrent, ptx_devices[n]->ctx);
+ CUDA_CALL_ERET (propval, cuMemGetInfo, &free_mem, &total_mem);
+ CUDA_CALL_ASSERT (cuCtxPopCurrent, &old_ctx);
+ }
+ else
+ {
+ CUcontext new_ctx;
+
+ CUDA_CALL_ERET (propval, cuCtxCreate, &new_ctx, CU_CTX_SCHED_AUTO,
+ dev);
+ CUDA_CALL_ERET (propval, cuMemGetInfo, &free_mem, &total_mem);
+ CUDA_CALL_ASSERT (cuCtxDestroy, new_ctx);
+ }
(I'm CCing Tom here, as he is the maintainer for these parts)
As we discussed earlier on our internal list, I think properly using GOMP_OFFLOAD_init_device
is the right way, instead of using the lower-level CUDA context create/destroy calls.
I did not mean for you to first init the device and then immediately destroy it with
GOMP_OFFLOAD_fini_device just to obtain the property. Rather, take the opportunity to initialize
the device for use, and leave it initialized until program exit. That should save resources overall.
(BTW, CUDA contexts should be quite expensive to create/destroy, so using a cuCtxCreate/cuCtxDestroy pair
is probably almost as slow as a full init/fini.)
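To illustrate the init-once idea, here is a rough sketch in plain C. It is not the actual nvptx plugin code: mock_device, mock_init_device, mock_get_free_memory, NUM_DEVICES and the memory values are all made up for illustration, and no CUDA calls are involved. The point is only that the first property query pays for initialization, and later queries reuse the already-initialized device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_DEVICES 2   /* hypothetical device count */

/* Hypothetical stand-in for per-device state (cf. ptx_devices[n]).  */
struct mock_device
{
  bool initialized;
  size_t free_mem;
};

static struct mock_device devices[NUM_DEVICES];
static int init_count;  /* how many times the expensive init actually ran */

/* Stand-in for the expensive step (context creation in a real plugin).
   Initializes device N at most once and leaves it initialized.  */
static bool
mock_init_device (int n)
{
  if (n < 0 || n >= NUM_DEVICES)
    return false;
  if (!devices[n].initialized)
    {
      devices[n].initialized = true;
      devices[n].free_mem = 1024u * (size_t) (n + 1);  /* made-up value */
      init_count++;
    }
  return true;
}

/* Query a property: initialize on first use, then keep the device
   around for later use instead of tearing it down again.  */
static size_t
mock_get_free_memory (int n)
{
  if (!mock_init_device (n))
    return 0;  /* like the patch: a zero value signals failure */
  assert (devices[n].initialized);
  return devices[n].free_mem;
}
```

With this shape, querying the same device twice runs the expensive initialization only once, and an out-of-range device number just yields the zero value.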
Tom, do you have any comments on how to best write this part?
Thanks,
Chung-Lin