This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH, og8] Add OpenACC 2.6 `acc_get_property' support


Hi Chung-Lin,

> > +module openacc_c_string
> > +  implicit none
> > +
> > +  interface
> > +    function strlen (s) bind (C, name = "strlen")
> > +      use iso_c_binding, only: c_ptr, c_size_t
> > +      type (c_ptr), intent(in), value :: s
> > +      integer (c_size_t) :: strlen
> > +    end function
> > +  end interface
> > +
> > +end module
> 
> > +subroutine acc_get_property_string_h (n, d, p, s)
> > +  use iso_c_binding, only: c_char, c_int, c_ptr, c_f_pointer
> > +  use openacc_internal, only: acc_get_property_string_l
> > +  use openacc_c_string, only: strlen
> > +  use openacc_kinds
> ...
> > +  pint = int (p, c_int)
> > +  cptr = acc_get_property_string_l (n, d, pint)
> > +  clen = int (strlen (cptr))
> > +  call c_f_pointer (cptr, sptr, [clen])
> 
> AFAIK, things like strlen are already available in iso_c_binding, in forms
> like "C_strlen".
> Can you check again if that 'openacc_c_string' module is really necessary?

 Any pointers please?

 I can't see `c_strlen' or any equivalent interface defined either in the 
Fortran 2003 language standard or in the GCC documentation, nor does a 
`grep' over the GCC tree show anything relevant.  According to said 
documentation the `iso_c_binding' module defines only a handful of 
procedures.  The `strlen' interface provided here has been taken from one 
of our Fortran test cases, which strongly suggests there's no such API 
already available, or whoever wrote the test case would have chosen to use 
it, I suppose.
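
 For reference, the C-side equivalent of what the Fortran wrapper needs 
`strlen' for can be sketched as below.  This is only an illustration under 
assumptions: the part of `acc_get_property_string_h' that fills `s' is 
elided in the quote above, so I am assuming it ends up with ordinary 
Fortran character assignment semantics (truncate or pad with blanks), and 
`copy_blank_padded' is a made-up name, not anything in libgomp.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libgomp.  Copy the NUL-terminated
   string SRC into the fixed-length buffer DST of DST_LEN characters,
   truncating or padding with blanks the way a Fortran character
   assignment would.  DST is not NUL-terminated.  Return the number of
   characters taken from SRC.  */
static size_t
copy_blank_padded (char *dst, size_t dst_len, const char *src)
{
  size_t src_len, n;

  assert (dst != NULL && src != NULL);

  /* The length has to be discovered on the C side; this is the step the
     `strlen' interface in the patch exists for.  */
  src_len = strlen (src);
  n = src_len < dst_len ? src_len : dst_len;

  memcpy (dst, src, n);
  memset (dst + n, ' ', dst_len - n);
  return n;
}
```

 The point is merely that the length is not known up front, so some form 
of `strlen' has to be called before the pointer can be given a shape.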

> > +union gomp_device_property_value
> > +GOMP_OFFLOAD_get_property (int n, int prop)
> > +{
> > +  union gomp_device_property_value propval = { .val = 0 };
> > +
> > +  pthread_mutex_lock (&ptx_dev_lock);
> > +
> > +  if (!nvptx_init () || n >= nvptx_get_num_devices ())
> > +    {
> > +      pthread_mutex_unlock (&ptx_dev_lock);
> > +      return propval;
> > +    }
> > +
> > +  switch (prop)
> > +    {
> > +    case GOMP_DEVICE_PROPERTY_MEMORY:
> > +      {
> > +	size_t total_mem;
> > +	CUdevice dev;
> > +
> > +	CUDA_CALL_ERET (propval, cuDeviceGet, &dev, n);
> > +	CUDA_CALL_ERET (propval, cuDeviceTotalMem, &total_mem, dev);
> > +	propval.val = total_mem;
> > +      }
> > +      break;
> > +    case GOMP_DEVICE_PROPERTY_FREE_MEMORY:
> > +      {
> > +	size_t total_mem;
> > +	size_t free_mem;
> > +	CUdevice ctxdev;
> > +	CUdevice dev;
> > +
> > +	CUDA_CALL_ERET (propval, cuCtxGetDevice, &ctxdev);
> > +	CUDA_CALL_ERET (propval, cuDeviceGet, &dev, n);
> > +	if (dev == ctxdev)
> > +	  CUDA_CALL_ERET (propval, cuMemGetInfo, &free_mem, &total_mem);
> > +	else if (ptx_devices[n])
> > +	  {
> > +	    CUcontext old_ctx;
> > +
> > +	    CUDA_CALL_ERET (propval, cuCtxPushCurrent, ptx_devices[n]->ctx);
> > +	    CUDA_CALL_ERET (propval, cuMemGetInfo, &free_mem, &total_mem);
> > +	    CUDA_CALL_ASSERT (cuCtxPopCurrent, &old_ctx);
> > +	  }
> > +	else
> > +	  {
> > +	    CUcontext new_ctx;
> > +
> > +	    CUDA_CALL_ERET (propval, cuCtxCreate, &new_ctx, CU_CTX_SCHED_AUTO,
> > +			    dev);
> > +	    CUDA_CALL_ERET (propval, cuMemGetInfo, &free_mem, &total_mem);
> > +	    CUDA_CALL_ASSERT (cuCtxDestroy, new_ctx);
> > +	  }
> 
> (I'm CCing Tom here, as he is maintainer for these parts)
> 
> As we discussed earlier on our internal list, I think properly using
> GOMP_OFFLOAD_init_device is the right way, instead of using the lower
> level CUDA context create/destroy.
> 
> I did not mean for you to first init the device and then immediately
> destroy it by GOMP_OFFLOAD_fini_device, just to obtain the property, but
> for you to just take the opportunity to initialize it for use, and leave
> it there until program exit. That should save resources overall.
> (BTW, CUDA contexts should be quite expensive to create/destroy, using a
> cuCtxCreate/Destroy pair is probably almost as slow)

 I have argued that this looks like a corner case to me, as querying for 
the remaining (rather than total) memory available to a device that hasn't 
been used yet seems of hardly any use, because obviously at such a stage 
no memory has been used.  The OpenACC standard does require us to handle 
such a request somehow, with returning 0 being another option; however I 
thought we might as well have a quick peek without pulling in all the 
state.

 I guess I have no strong opinion either way and I can adapt accordingly.

 NB that would have to be `gomp_init_device' rather than 
`GOMP_OFFLOAD_init_device' AFAICS.
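
 To make sure we mean the same thing, the "initialize once and leave it 
there" variant would look roughly like the sketch below.  It is a mock 
only: `struct mock_device', `mock_init_device' and `mock_get_free_memory' 
are invented stand-ins, no CUDA or libgomp calls are made, and locking is 
left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-device descriptor; not libgomp's.  */
struct mock_device
{
  bool initialized;
  int init_count;	/* How many times the expensive setup has run.  */
  size_t free_mem;	/* Pretend value the device would report.  */
};

/* Stand-in for the expensive step (context creation and so on).  */
static void
mock_init_device (struct mock_device *dev)
{
  dev->init_count++;
  dev->initialized = true;
}

/* Return the free memory of DEV, initializing the device on first use
   and leaving it initialized for any later offloading, rather than
   creating and destroying a temporary context for every query.  */
static size_t
mock_get_free_memory (struct mock_device *dev)
{
  assert (dev != NULL);

  if (!dev->initialized)
    mock_init_device (dev);
  return dev->free_mem;
}
```

 With this shape a second query, or any subsequent use of the device, 
pays no further setup cost; the price is that a device that is only ever 
queried stays initialized until program exit.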

  Maciej

