[gomp4.1] Handle new form of #pragma omp declare target
Ilya Verbin
iverbin@gmail.com
Mon Oct 26 19:49:00 GMT 2015
On Mon, Oct 26, 2015 at 20:05:39 +0100, Jakub Jelinek wrote:
> On Mon, Oct 26, 2015 at 09:35:52PM +0300, Ilya Verbin wrote:
> > On Fri, Jul 17, 2015 at 15:05:59 +0200, Jakub Jelinek wrote:
> > > As the testcases show, #pragma omp declare target has now a new form (well,
> > > two; with some issues on it pending), where it is used just as a single
> > > declarative directive rather than a pair of them and allows marking
> > > vars and functions by name as "omp declare target" vars/functions (which the
> > > middle-end etc. already handles), but also "omp declare target link", which
> > > is a deferred var, that is not initially mapped (on devices without shared
> > > memory with host), but has to be mapped explicitly.
> >
> > I don't quite understand how link should work. OpenMP 4.5 says:
> >
> > "The list items of a link clause are not mapped by the declare target directive.
> > Instead, their mapping is deferred until they are mapped by target data or
> > target constructs. They are mapped only for such regions."
> >
> > But doesn't this mean that the example bellow should work identically
> > with/without USE_LINK defined? Or is there some difference on other testcases?
>
> On your testcase, the end result is pretty much the same, the variable is
> not mapped initially to the device, and at the beginning of omp target it is
> mapped to device, at the end of the region it is unmapped from the device
> (without copying back).
>
> But consider:
>
> int a = 1, b = 1;
> #pragma omp declare target link (a) to (b)
> int
> foo (void)
> {
> return a++ + b++;
> }
> #pragma omp declare target to (foo)
> int
> main ()
> {
> a = 2;
> b = 2;
> int res;
> #pragma omp target map (to: a, b) map (from: res)
> {
> res = foo () + foo ();
> }
> // This assumes only non-shared address space, so would need to be guarded
> // for that.
> if (res != (2 + 1) + (3 + 2))
> __builtin_abort ();
> return 0;
> }
>
> Without declare target link or to, you can't use the global variables
> in orphaned accelerated routines (unless you e.g. take the address of the
> mapped variable in the region and pass it around).
> The to variables (non-deferred) are always mapped and are initialized with
> the original initializer, refcount is infinity. link (deferred) work more
> like the normal mapping, referencing those vars when they aren't explicitly
> (or implicitly) mapped is unspecified behavior, if it is e.g. mapped freshly
> with to kind, it gets the current value of the host var rather than the
> original one. But, beyond the mapping the compiler needs to ensure that
> all uses of the link global var (or perhaps just all uses of the link global
> var outside of the target construct body where it is mapped, because you
> could use there the pointer you got from GOMP_target) are replaced by
> dereference of some artificial pointer, so a becomes *a_tmp and &a becomes
> &*a_tmp, and that the runtime library during registration of the tables is
> told about the address of this artificial pointer. During registration,
> I'd expect it would stick an entry for this range into the table, with some
> special flag or something similar, indicating that it is deferred mapping
> and where the offloading device pointer is. During mapping, it would map it
> as any other not yet mapped object, but additionally would also set this
> device pointer to the device address of the mapped object. We also need to
> ensure that when we drop the refcount of that mapping back to 0, we get it
> back to the state where it is described as a range with registered deferred
> mapping and where the device pointer is.
Ok, got it, I'll try implement this...
> > > we actually replace the variables with pointers to variables, then need
> > > to somehow also mark those in the offloading tables, so that the library
> >
> > I see 2 possible options: use the MSB of the size, or introduce the third field
> > for flags.
>
> Well, it can be either recorded in the host variable tables (which contain
> address and size pair, right), or in corresponding offloading device table
> (which contains the pointer, something else?).
It contains a size too, which is checked in libgomp:
gomp_fatal ("Can't map target variables (size mismatch)");
Yes, we can remove this check, and use second field in device table for flags.
-- Ilya
More information about the Gcc-patches
mailing list