This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [openacc] tile, independent, default, private and firstprivate support in c/++
- From: Nathan Sidwell <nathan at codesourcery dot com>
- To: Thomas Schwinge <thomas at codesourcery dot com>, Cesar Philippidis <cesar at codesourcery dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Jakub Jelinek <jakub at redhat dot com>
- Date: Thu, 5 Nov 2015 12:13:02 -0500
- Subject: Re: [openacc] tile, independent, default, private and firstprivate support in c/++
- References: <5639325B dot 90006 at codesourcery dot com> <87k2pwzlqa dot fsf at kepler dot schwinge dot homeip dot net> <563B6C1E dot 8070408 at codesourcery dot com> <87twp02xdr dot fsf at schwinge dot name>
On 11/05/15 12:01, Thomas Schwinge wrote:
> On Thu, 5 Nov 2015 06:47:58 -0800, Cesar Philippidis <cesar@codesourcery.com> wrote:
>> On 11/05/2015 04:14 AM, Thomas Schwinge wrote:
>> Sorry, I must have mis-phrased it. The spec is unclear here. There are
>> three possible ways to interpret 'acc parallel loop reduction':
>>
>>   1. acc parallel reduction
>>      acc loop
>
> This is what you propose in your patch, but I don't think that makes
> sense, or does it? I'm happy to learn otherwise, but in my current
> understanding, a reduction clause needs to be attached (at least) to the
> innermost construct where reductions are to be processed. (Let's also

Correct, the above interpretation must be wrong.

> consider multi-level gang/worker/vector loops/reductions.) So, either:
>
>>   2. acc parallel
>>      acc loop reduction
>
> ... this, or even this:
>
>>   3. acc parallel reduction
>>      acc loop reduction
>
> ..., which I'm not sure what the execution model implementation requires.
> (Nathan?)

Interpretation #2 is sufficient, I think. However, both are lacking a 'copy
(reduction_var)' clause, as otherwise there's nothing changing the default data
attribute of 'firstprivate' (working on that patch). Perhaps 'reduction' on
'parallel' is meant to imply that (because that's what makes sense), but the
std doesn't say it.

In summary, it's probably safe to implement interpretation #3. That way we can
implement the hypothesis that reductions at the outer construct imply copy.
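
For illustration only, here is a minimal C sketch of interpretation #2 with the
copy clause written out explicitly. The function name sum_split and its
arguments are hypothetical and not taken from the patch. When the compiler is
run without OpenACC support the pragmas are ignored and the loop simply runs
serially.

```c
/* Hypothetical helper, not from the patch: sums n ints.

   Interpretation #2: the reduction clause stays on the loop construct.
   The explicit copy clause on the parallel construct is an assumption
   about how to override the default 'firstprivate' data attribute of
   the scalar, so that the reduced value is visible on the host.  */
int
sum_split (const int *a, int n)
{
  int sum = 0;

#pragma acc parallel copyin (a[0:n]) copy (sum)
  {
#pragma acc loop reduction (+:sum)
    for (int i = 0; i < n; i++)
      sum += a[i];
  }

  return sum;
}
```

Under interpretation #3 the parallel construct would carry a reduction clause
as well, on the hypothesis that it implies the copy shown here.
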
> And while we're at it: the very same question also applies to the private
> clause, which -- contrary to all other (as far as I remember) clauses --
> also is applicable to both the parallel and loop constructs:
>
>     #pragma acc parallel loop private([...])
>
> ... is to be decomposed into which of the following:
>
>     #pragma acc parallel private([...])
>     #pragma acc loop
>
>     #pragma acc parallel
>     #pragma acc loop private([...])
>
>     #pragma acc parallel private([...])
>     #pragma acc loop private([...])
>
> (There is no private clause allowed to be specified with the kernels
> construct for what it's worth, but that doesn't mean we couldn't use it
> internally, of course, if so required.)

I think interpretation #2 or #3 makes sense, and I suspect they result in the
same emitted code.
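
As a rough sketch of what the second and third decompositions of the private
clause look like when written out by hand, assuming a scalar temporary used
inside the loop body: the function names scale_v2 and scale_v3 and the
computation itself are hypothetical, picked only to give the private variable
something to do.

```c
/* Hypothetical kernels, not from the patch.  Each writes
   out[i] = (in[i] + 1)^2 using a scalar temporary.  */

/* Decomposition #2: private only on the loop construct.  */
void
scale_v2 (const int *in, int *out, int n)
{
  int tmp;

#pragma acc parallel copyin (in[0:n]) copyout (out[0:n])
  {
#pragma acc loop private (tmp)
    for (int i = 0; i < n; i++)
      {
        tmp = in[i] + 1;
        out[i] = tmp * tmp;
      }
  }
}

/* Decomposition #3: private on both the parallel and loop constructs.  */
void
scale_v3 (const int *in, int *out, int n)
{
  int tmp;

#pragma acc parallel private (tmp) copyin (in[0:n]) copyout (out[0:n])
  {
#pragma acc loop private (tmp)
    for (int i = 0; i < n; i++)
      {
        tmp = in[i] + 1;
        out[i] = tmp * tmp;
      }
  }
}
```

Both functions produce the same results; whether the compiler emits the same
code for them is the open question above.
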
nathan
--
Nathan Sidwell