[RFC][WIP Patch] OpenMP map with iterator + Fortran OpenMP deep mapping / custom allocator (+ Fortran co_reduce)

Tobias Burnus tobias@codesourcery.com
Mon Dec 6 14:00:30 GMT 2021


This is an RFC/WIP patch about:

(A) OpenMP (C/C++/Fortran)
    omp target map(iterator(i=n:m),to : x(i))

(B) Fortran:
    (1) omp target map(to : dt_var, class_var)
    (2) omp parallel allocator(my_alloc) firstprivate(class_var)
    (3) call co_reduce(dt_coarray, my_func)

The problem with (A) is that the number of iterations is not known
at compile time, so the mapped items cannot easily be added to the
array used to call GOMP_target_ext.

The problem with (B) is that dt_var can have allocatable components,
which complicates things: with recursive types, the number of
elements is not known at compile time - nor with polymorphic types -
as it depends on the recursion depth and the dynamic type, respectively.


Comments/questions/remarks ... on the proposal below?

Regarding mapping, I currently have no idea how to handle
the virtual table. Thoughts?

  * * *

The idea for OpenMP mapping is a callback function - such that

integer function f() result(ires)
   implicit none
   integer :: a
   !$omp target  map(iterator(i=1:5), to: a)
   !$omp end target
   ires = 7
end

becomes

   #pragma omp target map(iterator(integer(kind=4) i=1:5:1):to:a)

and then during gimplify:

   #pragma omp target num_teams(1) thread_limit(0) map(map_function:f_._omp_mapfn.0 [len: 0])

with

unsigned long f_._omp_mapfn.0 (unsigned long (*<T626>) (void *) cb_fn,
                                void * token, void * base, unsigned short flags)
{
...

with the loop around the cb_fn call and flag = GOMP_MAP_TO.

(Not fully working yet. The ME (middle end) part still needs to
generate the loop, similar to depend or affinity. For C/C++, the
basic parsing is done but some more code changes are needed
in the FE.)
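
To illustrate the intended shape of such a map function, here is a minimal
hand-written C sketch. It is not what the patch generates: the callback type
map_cb_t, the flag value SKETCH_MAP_TO and all sketch_* names are assumptions
made up for this example, and the real cb_fn signature is still in flux. It
only shows the loop around the cb_fn call for an iterator(i=1:5) that maps
x(i), with the token standing in for runtime state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: the runtime hands one of these to the
   generated map function and records one map entry per call.  */
typedef unsigned long (*map_cb_t) (void *token, void *addr, size_t size,
                                   unsigned short flags);

enum { SKETCH_MAP_TO = 1 };  /* stand-in for GOMP_MAP_TO; value is arbitrary */

/* Hand-written analogue of what f_._omp_mapfn.0 could do for
   map(iterator(i=1:5), to: x(i)): one callback call per iteration.  */
static unsigned long
sketch_mapfn (map_cb_t cb_fn, void *token, void *base, unsigned short flags)
{
  int *x = base;
  unsigned long n = 0;
  for (int i = 1; i <= 5; i++)          /* iterator(i=1:5:1) */
    n += cb_fn (token, &x[i - 1], sizeof x[0], flags);
  return n;
}

/* Toy runtime side: the token is a running total of bytes to map.  */
static unsigned long
sketch_count_bytes_cb (void *token, void *addr, size_t size,
                       unsigned short flags)
{
  assert (addr != NULL && flags == SKETCH_MAP_TO);
  *(size_t *) token += size;
  return 1;                             /* one map entry produced */
}

/* Usage: returns the number of entries and stores the byte total.  */
static unsigned long
sketch_demo (size_t *bytes)
{
  int x[5] = { 0 };
  *bytes = 0;
  return sketch_mapfn (sketch_count_bytes_cb, bytes, x, SKETCH_MAP_TO);
}
```

The point of the design is that the caller never needs to know the iteration
count up front; it just accumulates whatever the callback reports.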


  * * *

Fortran - with an OpenMP example:

module m
   implicit none (type, external)
   type t3
   end type t3
   type t
     class(t3), allocatable :: cx
     type(t3), pointer :: ptx
   end type t
end module m

use m
implicit none (type, external)
class(t), allocatable :: var

!$omp target map(to:var)
   if (allocated(var)) stop 1
!$omp end target
end


The idea is that this becomes:

   #pragma omp target map(to:var) map(map_function:var._vptr->_callback [len: 1]) map(to:var [len: 0])

That's:
* 'var' is first normally mapped
* Then the map function is added which gets 'var' as argument


(For an array, I plan to add an internal function which calls the
callback function in a scalarization loop.)
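
A rough C sketch of such an internal wrapper, assuming a contiguous rank-1
array and a simplified callback signature. All arr_* names and types are
hypothetical and only meant to show the scalarization loop around the
per-element callback; a real implementation would have to honor the array
descriptor (rank, strides).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified callback types (not the patch's ABI).  */
typedef unsigned long (*arr_cb_t) (void *token, void *addr, size_t size);
typedef unsigned long (*arr_type_cb_t) (arr_cb_t cb, void *token,
                                        void *scalar);

/* Stand-in for a derived type with one allocatable component.  */
struct arr_elem { double *comp; };

/* Per-element callback, playing the role of the vtable's _callback.  */
static unsigned long
arr_elem_callback (arr_cb_t cb, void *token, void *scalar)
{
  struct arr_elem *e = scalar;
  return e->comp != NULL ? cb (token, e->comp, sizeof *e->comp) : 0;
}

/* Internal wrapper: scalarization loop over a contiguous rank-1 array.  */
static unsigned long
arr_callback_loop (arr_type_cb_t per_elem, arr_cb_t cb, void *token,
                   void *base, size_t n, size_t elem_size)
{
  unsigned long total = 0;
  assert (base != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    total += per_elem (cb, token, (char *) base + i * elem_size);
  return total;
}

/* Toy consumer: counts how many components were reported.  */
static unsigned long
arr_count_cb (void *token, void *addr, size_t size)
{
  (void) addr;
  (void) size;
  ++*(int *) token;
  return 1;
}

/* Usage: three elements, two of which have an allocated component.  */
static unsigned long
arr_demo (int *seen)
{
  double a = 1.0, b = 2.0;
  struct arr_elem v[3] = { { &a }, { NULL }, { &b } };
  *seen = 0;
  return arr_callback_loop (arr_elem_callback, arr_count_cb, seen,
                            v, 3, sizeof v[0]);
}
```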


On the Fortran side - this requires a new entry in the vtable
(*ABI breakage*), which points to:

integer(kind=8) __callback_m_T (
    integer(kind=8) (*<T655>) (void *, void *, integer(kind=8),
                               void (*<T6d>) (void), integer(kind=2)) cb,
    void * token, struct t & restrict scalar, integer(kind=4) f_flags)
{
   __result___callback_m_T = 0;
   if (scalar->cx._data != 0B)
     {
       void * D.4384;
       D.4384 = (void *) scalar->cx._data;
       __result___callback_m_T = cb (token, D.4384, scalar->cx._vptr->_size, 0B, 0)
                                 + __result___callback_m_T;
       __result___callback_m_T = cb (token, *scalar->cx._data, 0, *scalar->cx._vptr->_callback, 0)
                                 + __result___callback_m_T;
     }
   if (scalar->ptx != 0B)
     {
       void * D.4386;
       D.4386 = (void *) scalar->ptx;
       __result___callback_m_T = cb (token, D.4386, 0, 0B, 0) + __result___callback_m_T;
     }
   return __result___callback_m_T;
}


That is:

* For pointers, the CB is called with SIZE = 0, permitting the caller to
   remap the pointer - or to ignore the callback call.
* For allocatables, it passes the SIZE, permitting the caller to map the
   allocatable.
* If the allocatable is a CLASS or has allocatable components, cb is
   additionally called with a callback function (and SIZE = 0), such that
   those can be mapped as well.

(The GOMP_MAP_TO needs to be handled by libgomp, e.g. by putting it into
the void *token.)
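
To make the three cases concrete, here is a self-contained C sketch that
mimics the dump above together with a toy consumer of the callback. The
struct layouts, the deep_* names and the way the consumer recurses into the
nested callback are assumptions for illustration only; libgomp's actual
handling is not implemented yet and may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Generic function pointer, like the void (*)(void) in the dump.  */
typedef void (*deep_fn_t) (void);
typedef unsigned long (*deep_cb_t) (void *token, void *addr, size_t size,
                                    deep_fn_t nested, unsigned short flags);
typedef unsigned long (*deep_type_cb_t) (deep_cb_t cb, void *token,
                                         void *scalar, int f_flags);

/* Hypothetical vtable with the two entries used here.  */
struct deep_vtab { size_t size; deep_type_cb_t callback; };

struct deep_t3 { int dummy; };          /* 'type t3' (empty in Fortran) */
struct deep_t
{
  struct { struct deep_t3 *data; const struct deep_vtab *vptr; } cx;
  struct deep_t3 *ptx;                  /* 'type(t3), pointer :: ptx' */
};

/* t3 has no allocatable components: nothing to report.  */
static unsigned long
deep_t3_callback (deep_cb_t cb, void *token, void *scalar, int f_flags)
{
  (void) cb; (void) token; (void) scalar; (void) f_flags;
  return 0;
}

static const struct deep_vtab deep_t3_vtab
  = { sizeof (struct deep_t3), deep_t3_callback };

/* Analogue of __callback_m_T from the dump.  */
static unsigned long
deep_t_callback (deep_cb_t cb, void *token, void *scalar, int f_flags)
{
  struct deep_t *s = scalar;
  unsigned long res = 0;
  (void) f_flags;
  if (s->cx.data != NULL)
    {
      /* Allocatable: pass its size, then its own callback (SIZE = 0).  */
      res += cb (token, s->cx.data, s->cx.vptr->size, NULL, 0);
      res += cb (token, s->cx.data, 0, (deep_fn_t) s->cx.vptr->callback, 0);
    }
  if (s->ptx != NULL)
    res += cb (token, s->ptx, 0, NULL, 0);   /* pointer: SIZE = 0 */
  return res;
}

struct deep_log { int pointers; int allocs; size_t bytes; };

/* Toy consumer: recurse on nested callbacks, count pointers, sum sizes.  */
static unsigned long
deep_record_cb (void *token, void *addr, size_t size, deep_fn_t nested,
                unsigned short flags)
{
  struct deep_log *lg = token;
  (void) flags;
  assert (lg != NULL);
  if (nested != NULL)
    return ((deep_type_cb_t) nested) (deep_record_cb, token, addr, 0);
  if (size == 0)
    {
      lg->pointers++;                   /* could remap, or just ignore */
      return 0;
    }
  lg->allocs++;
  lg->bytes += size;
  return 1;
}

/* Usage: one allocated class component plus one associated pointer.  */
static unsigned long
deep_demo (struct deep_log *lg)
{
  struct deep_t3 obj = { 0 }, target = { 0 };
  struct deep_t var = { { &obj, &deep_t3_vtab }, &target };
  lg->pointers = 0;
  lg->allocs = 0;
  lg->bytes = 0;
  return deep_t_callback (deep_record_cb, lg, &var, 0);
}
```

Here the return value is used as the number of entries that actually get
mapped; that interpretation is a guess on top of the dump, which only shows
that the results are summed up.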


The vtable's callback function can then also be used with
* OpenMP ALLOCATOR or for
* deep copying with CO_REDUCE.


Question: Does this way of passing make sense or not?
Comments?


Tobias


PS: The patch has a lot of pieces in place, but still lacks both
some glue code and some other bits. :-/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: deep-map.diff
Type: text/x-patch
Size: 57030 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc-patches/attachments/20211206/77e247d8/attachment-0001.bin>

