This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [OpenACC 0/7] host_data construct


On Thu, 19 Nov 2015 14:13:45 +0100
Jakub Jelinek <jakub@redhat.com> wrote:

> On Wed, Nov 18, 2015 at 12:47:47PM +0000, Julian Brown wrote:
> 
> The FE/gimplifier part is okay, but I really don't like the
> omp-low.c changes, mostly the *lookup_decl_in_outer_ctx* changes.
> If I count well, we have right now 27 maybe_lookup_decl_in_outer_ctx
> callers and 7 lookup_decl_in_outer_ctx callers, you want to change
> behavior of 1 maybe_lookup_decl_in_outer_ctx and 1
> lookup_decl_in_outer_ctx.  Why exactly those 2 and not the others?

The not-very-good reason is that those are merely the places that
allowed the supplied examples to work, and I'm wary of changing other
code that I don't understand very well.

> What are the exact rules (what does the standard say about it)?
> I'd expect that all phases (scan_sharing_clauses, lower_omp* and
> expand_omp*) should agree on the same behavior, otherwise I can't see
> how it can work properly.

OK, thanks -- as to what the standard says, it's so ill-specified in
this area that nothing can be learned about the behaviour of offloaded
regions within host_data constructs, and my question about that on the
technical mailing list is still unanswered (actually Nathan suggested
in private mail that the conservative thing to do would be to disallow
offloaded regions entirely within host_data constructs, so maybe that's
the way to go).

OpenMP 4.5 seems to *not* specify the skipping-over behaviour for
use_device_ptr variables (p105, lines 20-23):

"The is_device_ptr clause is used to indicate that a list item is a
device pointer already in the device data environment and that it
should be used directly. Support for device pointers created outside
of OpenMP, specifically outside of the omp_target_alloc routine and the
use_device_ptr clause, is implementation defined."

That suggests that use_device_ptr is a valid way to create device
pointers for use in enclosed target regions: the behaviour I assumed
was wrong for OpenACC. So I think my guess at the "most-obvious"
behaviour was probably misguided anyway.

It's maybe even more complicated. Consider the example:

char x[1024];

#pragma acc enter data copyin(x)

#pragma acc host_data use_device(x)
{
  target_primitive(x);
  #pragma acc parallel present(x)    /* [1] */
  {
    x[5] = 0;                        /* [2] */
  }
}

Here, the "present" clause marked [1] will fail (because 'x' is a
target pointer now). If it's omitted, the array access [2] will cause an
implicit present_or_copy to be used for the 'x' pointer (which again
will fail, because now 'x' points to target data). Maybe what we
actually need is:

#pragma acc host_data use_device(x)
{
  target_primitive(x);
  #pragma acc parallel deviceptr(x)
  {
    ...
  }
}

with the deviceptr(x) clause magically substituted in the parallel
construct, but I'm struggling to see how we could justify doing that
when that behaviour's not mentioned in the spec at all.

Aha, so: maybe manually using deviceptr(x) is implicitly mandatory in
this situation, and missing it out should be an error? That suddenly
seems to make most sense. I'll see about fixing the patch to do that.
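
For illustration only, here is a minimal sketch of what the explicit
form might look like in user code. It is not taken from the patch, and
it sidesteps the nesting question by capturing the device address in a
pointer inside host_data and launching the parallel region afterwards.
The names run_example and xd are made up for this sketch, and
target_primitive is just a stub standing in for the routine in the
examples above. Built without OpenACC support the pragmas are ignored
and everything runs on the host.

```c
#include <assert.h>
#include <stddef.h>

#define N 1024

/* Hypothetical stand-in for target_primitive() in the examples above:
   a host routine that is handed a device address.  */
static void target_primitive(char *dev_ptr)
{
  (void) dev_ptr;
}

/* Hypothetical driver: returns x[5] as seen by the host afterwards.  */
int run_example(void)
{
  static char x[N];
  char *xd = NULL;

  x[5] = 42;

  #pragma acc enter data copyin(x)

  #pragma acc host_data use_device(x)
  {
    /* Inside host_data, 'x' denotes the device copy's address.  */
    target_primitive(x);
    xd = x;
  }
  assert(xd != NULL);

  /* The user states explicitly that 'xd' is already a device pointer,
     so no present or implicit present_or_copy lookup is attempted.  */
  #pragma acc parallel deviceptr(xd)
  {
    xd[5] = 0;
  }

  /* Copy the device data back so the host can observe the store.  */
  #pragma acc exit data copyout(x)

  return x[5];
}
```

Calling run_example() should give 0 whether or not the code is
offloaded, since the store goes through xd in either case.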

Julian

