This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH, PR70895] Add copy mapping for reductions on OpenACC loop directives
- From: Thomas Schwinge <thomas at codesourcery dot com>
- To: Chung-Lin Tang <cltang at codesourcery dot com>
- Cc: Cesar Philippidis <cesar_philippidis at mentor dot com>, gcc-patches <gcc-patches at gcc dot gnu dot org>, Jakub Jelinek <jakub at redhat dot com>
- Date: Tue, 16 Aug 2016 13:06:21 +0200
- Subject: Re: [PATCH, PR70895] Add copy mapping for reductions on OpenACC loop directives
- Authentication-results: sourceware.org; auth=none
- References: <7fdd9375-4f2c-b3aa-0888-f725c91a13fd@codesourcery.com>
Hi!
On Mon, 15 Aug 2016 19:25:48 +0800, Chung-Lin Tang <cltang@codesourcery.com> wrote:
> per the discussion on the bugzilla PR page, reductions on OpenACC loop
> directives will automatically get a copy clause mapping on an enclosing
> parallel construct (unless bounded by a local variable or an explicit
> firstprivate clause).
>
> There is also a patch for libgomp testsuite cases. Asides from the
> fortran case which now needs explicit firstprivate clauses to work,
> other C/C++ cases have been adjusted to remove explicit copy clauses.
> (I have not exhaustively searched everywhere to eliminate them though)
>
> This has been tested using gomp-4_0-branch, which is based on GCC 6,
> which is what this PR was originally filed for.
>
> I will be committing this soon for gomp-4_0-branch,
> is this okay for gcc-6-branch and trunk as well?
On Mon, 15 Aug 2016 15:23:14 +0200, Jakub Jelinek <jakub@redhat.com> wrote:
> The gimplify.c change is ok for trunk and 6.3 (after 6.2 is released).
> As for the testsuite, I'll leave it to Thomas/Nathan on what they prefer,
> I'd think that having explicit clauses in e.g. half of the testcases and
> implicit ones in the other half wouldn't hurt, so that both are tested
> enough.
ACK, but from a quick scan it seems as if there's still sufficient
coverage remaining with explicit usage.
What I'd like to see changed/added, though, is some libgomp.oacc-fortran
test coverage of the new implicit copy clauses, and a handful of C/C++ as
well as Fortran tree-scanning tests in gcc/testsuite/ -- basically, to
document the expected behavior.
> PR middle-end/70895
> gcc/
> * gimplify.c (omp_add_variable): Adjust/add variable mapping on
> enclosing parallel construct for reduction variables on OpenACC loop
> directives.
> --- gcc/gimplify.c (revision 239471)
> +++ gcc/gimplify.c (working copy)
> @@ -5897,6 +5897,37 @@ omp_add_variable (struct gimplify_omp_ctx *ctx, tr
> n->value |= flags;
> else
> splay_tree_insert (ctx->variables, (splay_tree_key)decl, flags);
> +
> + /* For reductions clauses in OpenACC loop directives, by default create a
> + copy clause on the enclosing parallel construct for carrying back the
> + results. */
> + if (ctx->region_type == ORT_ACC && (flags & GOVD_REDUCTION))
Should this be "ctx->region_type & ORT_ACC" instead of "=="? I suppose
the same thing also applies to OpenACC nested parallelism:
#pragma acc parallel
{
[...]
#pragma acc parallel reduction([...])
{
[...]
..., which we're not supporting right now, but that'll make it easier to
spot once adding such support.
> + {
> + struct gimplify_omp_ctx *outer_ctx = ctx->outer_context;
> + while (outer_ctx)
> + {
> + n = splay_tree_lookup (outer_ctx->variables, (splay_tree_key)decl);
> + if (n != NULL)
> + {
> + /* Ignore local variables and explicitly declared clauses. */
> + if (n->value & (GOVD_LOCAL | GOVD_EXPLICIT))
> + break;
> + else if (outer_ctx->region_type == ORT_ACC_PARALLEL)
> + {
> + /* Remove firstprivate and make it a copy map. */
> + n->value &= ~GOVD_FIRSTPRIVATE;
> + n->value |= GOVD_MAP;
> + }
> + }
> + else if (outer_ctx->region_type == ORT_ACC_PARALLEL)
> + {
> + splay_tree_insert (outer_ctx->variables, (splay_tree_key)decl,
> + GOVD_MAP | GOVD_SEEN);
> + break;
> + }
> + outer_ctx = outer_ctx->outer_context;
> + }
> + }
> }
I suppose we can also run into this for ORT_ACC_KERNELS (if not, should
mark/document that with gcc_unreachable or some such); per OpenACC 2.0a,
2.5.2 Kernels Construct, "a scalar variable referenced in the kernels
construct that does not appear in a data clause for the construct or any
enclosing data construct will be treated as if it appeared in a copy
clause", so we should assert that this already is a GOVD_MAP.
Can we also run into this for ORT_ACC_DATA and ORT_ACC_HOST_DATA, but
nothing needs to be done for these?
As I argued before, I'm a big fan of making things explicit (even more so
in the hairy gimilification code...), make it easy for the reader, and
thus document (assertions, gcc_unreachable) all the expected and
unexpected cases in the code, that is, write this similar to (untested):
n = splay_tree_lookup (outer_ctx->variables, (splay_tree_key)decl);
if (n != NULL)
{
/* Ignore local variables and explicitly declared clauses. */
if (n->value & (GOVD_LOCAL | GOVD_EXPLICIT))
break; [TODO: really break here; again regarding OpenACC nested parallelism?]
else if (outer_ctx->region_type == ORT_ACC_PARALLEL)
{
/* Remove firstprivate and make it a copy map. */
n->value &= ~GOVD_FIRSTPRIVATE;
n->value |= GOVD_MAP;
}
else if (outer_ctx->region_type == ORT_ACC_KERNELS)
{
/* Per OpenACC 2.0a, 2.5.2 Kernels Construct, "a scalar variable [text from above] */
gcc_assert (!(n->value & GOVD_FIRSTPRIVATE)
&& (n->value & GOVD_MAP));
}
else if (outer_ctx->region_type == ORT_ACC_DATA
|| outer_ctx->region_type == ORT_ACC_HOST_DATA)
;
else
gcc_unreachable ();
}
else if (outer_ctx->region_type == ORT_ACC_PARALLEL)
{
splay_tree_insert (outer_ctx->variables, (splay_tree_key)decl,
GOVD_MAP | GOVD_SEEN);
break; [TODO: really break here; again regarding OpenACC nested parallelism?]
}
else [similar to above]
outer_ctx = outer_ctx->outer_context;
Grüße
Thomas