This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Use "oacc kernels" attribute for OpenACC kernels
- From: Thomas Schwinge <thomas at codesourcery dot com>
- To: GCC Patches <gcc-patches at gcc dot gnu dot org>, Jakub Jelinek <jakub at redhat dot com>
- Cc: Tom de Vries <tom at codesourcery dot com>
- Date: Mon, 8 May 2017 21:29:28 +0200
- Subject: Re: Use "oacc kernels" attribute for OpenACC kernels
- References: <568A922B.8010700@acm.org> <56A63A0A.2070304@acm.org> <56829657.9060106@acm.org> <87zip3jw2x.fsf@hertz.schwinge.homeip.net> <20160125150914.GK3017@tucnak.redhat.com> <87twfb9yyk.fsf@kepler.schwinge.homeip.net> <87popo7hmj.fsf@kepler.schwinge.homeip.net>
Hi!
On Thu, 4 Aug 2016 16:07:00 +0200, I wrote:
> Ping.
>
> On Wed, 27 Jul 2016 12:06:59 +0200, I wrote:
> > On Mon, 25 Jan 2016 16:09:14 +0100, Jakub Jelinek <jakub@redhat.com> wrote:
> > > On Mon, Jan 25, 2016 at 10:06:50AM -0500, Nathan Sidwell wrote:
> > > > On 01/04/16 10:39, Nathan Sidwell wrote:
> > > > >There's currently no robust predicate to determine whether an oacc offload
> > > > >function is for a kernels region (as opposed to a parallel region).
> > > > >[...]
> > > > >
> > > > >This patch marks TREE_PUBLIC on the offload attribute values, to note kernels
> > > > >regions, and adds a predicate to check that. [...]
> > > > >
> > > > >Using these predicates improves the dump output of the openacc device lowering
> > > > >pass too.
> >
> > I just submitted a patch adding "Test cases to check OpenACC offloaded
> > function's attributes and classification",
(Pinged in
<http://mid.mail-archive.com/877f1r1duw.fsf@hertz.schwinge.homeip.net>.)
> > to actually check the dump output of "oaccdevlow" -- it works. ;-)
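For readers who have not looked at these test cases: the following is
not copied from the testsuite, it is a minimal stand-alone C sketch with
made-up array and function names, containing one OpenACC kernels
construct and one OpenACC parallel construct.  Without -fopenacc the
pragmas are ignored and the loops simply run on the host.  Compiled with
"-fopenacc -fdump-tree-oaccdevlow" (assuming a GCC built with OpenACC
support), I would expect the "oaccdevlow" dump to classify the two
outlined functions as kernels offload and parallel offload, respectively.

```c
#define N 1024

/* Hypothetical input/output arrays, only for this illustration.  */
float in_a[N], in_b[N], out[N];

/* OpenACC kernels construct: the compiler analyzes the region and
   decides by itself whether the loop can be parallelized.  */
void
vec_add_kernels (void)
{
#pragma acc kernels copyin (in_a[0:N], in_b[0:N]) copyout (out[0:N])
  {
    for (int i = 0; i < N; i++)
      out[i] = in_a[i] + in_b[i];
  }
}

/* OpenACC parallel construct: the user asserts that the loop
   iterations may be distributed.  */
void
vec_mul_parallel (void)
{
#pragma acc parallel loop copyin (in_a[0:N], in_b[0:N]) copyout (out[0:N])
  for (int i = 0; i < N; i++)
    out[i] = in_a[i] * in_b[i];
}
```

Calling the two functions from a small main after filling in_a and in_b
is enough to try this out; both constructs end up as
BUILT_IN_GOACC_PARALLEL launches, which is why a separate marker for the
kernels case is needed at all.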
> >
> > > > https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00092.html
> > > > ping?
> > >
> > > Ok, thanks.
> >
> > It's conceptually and code-wise simpler to just use an "oacc kernels"
> > attribute for that.  (And that will make another patch I'm working on
> > less convoluted.)
> >
> > I'm open to suggestions if there is a better place to set the "oacc
> > kernels" attribute -- I put it into expand_omp_target, where another
> > special thing for GF_OMP_TARGET_KIND_OACC_KERNELS is already being done,
> > and before "rewriting" GF_OMP_TARGET_KIND_OACC_KERNELS (and
> > GF_OMP_TARGET_KIND_OACC_PARALLEL) into BUILT_IN_GOACC_PARALLEL. My
> > reasoning for not setting the attribute earlier (like, in the front
> > ends) is that at that point in/before expand_omp_target, we still have
> > the distinction between OACC_PARALLEL/OACC_KERNELS (tree codes), and
> > later GF_OMP_TARGET_KIND_OACC_PARALLEL/GF_OMP_TARGET_KIND_OACC_KERNELS
> > (GIMPLE_OMP_TARGET subcodes).  Another question/possible cleanup, of
> > course, might be to actually set the "oacc kernels" attribute in the
> > front end and merge OACC_KERNELS into OACC_PARALLEL, and
> > GF_OMP_TARGET_KIND_OACC_KERNELS into GF_OMP_TARGET_KIND_OACC_PARALLEL?
> >
> > But anyway, as a first step: OK for trunk?
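To illustrate why a separate attribute is simpler than a flag bit tucked
into the "oacc function" dimension list, here is a stand-alone model in
plain C.  None of this is GCC code: struct attr, struct fn_decl,
attr_cons, attr_lookup, mark_oacc_kernels and is_oacc_kernels are
hypothetical stand-ins for TREE_LIST nodes, FUNCTION_DECLs, tree_cons
and lookup_attribute, and the sketch ignores attribute arguments
entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one node of an attribute list.  */
struct attr
{
  const char *name;
  struct attr *next;
};

/* Hypothetical stand-in for a function declaration.  */
struct fn_decl
{
  struct attr *attributes;
};

/* Prepend an attribute called NAME to CHAIN, in the spirit of
   tree_cons.  Nodes are never freed in this sketch.  */
struct attr *
attr_cons (const char *name, struct attr *chain)
{
  assert (name != NULL);
  struct attr *a = malloc (sizeof *a);
  if (a == NULL)
    abort ();
  a->name = name;
  a->next = chain;
  return a;
}

/* Return the first attribute called NAME in LIST, or NULL if there is
   none, in the spirit of lookup_attribute.  */
const struct attr *
attr_lookup (const char *name, const struct attr *list)
{
  for (; list != NULL; list = list->next)
    if (strcmp (list->name, name) == 0)
      return list;
  return NULL;
}

/* Mark FN as outlined from a kernels construct.  */
void
mark_oacc_kernels (struct fn_decl *fn)
{
  fn->attributes = attr_cons ("oacc kernels", fn->attributes);
}

/* The whole predicate: no dimension list needs to be inspected.  */
bool
is_oacc_kernels (const struct fn_decl *fn)
{
  return attr_lookup ("oacc kernels", fn->attributes) != NULL;
}
```

The predicate only needs the function's attribute list.  By contrast,
oacc_fn_attrib_kernels_p had to be handed the "oacc function" attribute,
and oacc_validate_dims had to set TREE_PUBLIC again whenever it rebuilt
the dimension list.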
commit fac5c3214f58812881635d3fb1e1751446d4b660
Author: Thomas Schwinge <thomas@codesourcery.com>
Date: Mon May 8 21:24:46 2017 +0200
Use "oacc kernels" attribute for OpenACC kernels
gcc/
* omp-expand.c (expand_omp_target)
<GF_OMP_TARGET_KIND_OACC_KERNELS>: Set "oacc kernels" attribute.
* omp-general.c (oacc_set_fn_attrib): Remove is_kernel formal
parameter. Adjust all users.
(oacc_fn_attrib_kernels_p): Remove function.
* omp-offload.c (execute_oacc_device_lower): Look for "oacc kernels"
attribute instead of calling oacc_fn_attrib_kernels_p.
* tree-ssa-loop.c (gate_oacc_kernels): Likewise.
* tree-parloops.c (create_parallel_loop): If oacc_kernels_p,
assert "oacc kernels" attribute is set.
gcc/testsuite/
* c-c++-common/goacc/classify-kernels-unparallelized.c: Adjust.
* c-c++-common/goacc/classify-kernels.c: Likewise.
* c-c++-common/goacc/classify-parallel.c: Likewise.
* c-c++-common/goacc/classify-routine.c: Likewise.
* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise.
* gfortran.dg/goacc/classify-kernels.f95: Likewise.
* gfortran.dg/goacc/classify-parallel.f95: Likewise.
* gfortran.dg/goacc/classify-routine.f95: Likewise.
---
gcc/omp-expand.c | 16 +++++++++-----
gcc/omp-general.c | 18 ++--------------
gcc/omp-general.h | 4 +---
gcc/omp-offload.c | 25 +++++++++++-----------
.../goacc/classify-kernels-unparallelized.c | 8 +++----
.../c-c++-common/goacc/classify-kernels.c | 8 +++----
.../c-c++-common/goacc/classify-parallel.c | 2 +-
.../c-c++-common/goacc/classify-routine.c | 2 +-
.../goacc/classify-kernels-unparallelized.f95 | 8 +++----
.../gfortran.dg/goacc/classify-kernels.f95 | 8 +++----
.../gfortran.dg/goacc/classify-parallel.f95 | 2 +-
.../gfortran.dg/goacc/classify-routine.f95 | 2 +-
gcc/tree-parloops.c | 5 ++++-
gcc/tree-ssa-loop.c | 5 +----
14 files changed, 52 insertions(+), 61 deletions(-)
diff --git gcc/omp-expand.c gcc/omp-expand.c
index 5c48b78..405c60e 100644
--- gcc/omp-expand.c
+++ gcc/omp-expand.c
@@ -7083,7 +7083,16 @@ expand_omp_target (struct omp_region *region)
exit_bb = region->exit;
if (gimple_omp_target_kind (entry_stmt) == GF_OMP_TARGET_KIND_OACC_KERNELS)
- mark_loops_in_oacc_kernels_region (region->entry, region->exit);
+ {
+ mark_loops_in_oacc_kernels_region (region->entry, region->exit);
+
+ /* Further down, both OpenACC kernels and OpenACC parallel constructs
+ will be mapped to BUILT_IN_GOACC_PARALLEL, and to distinguish the
+ two, there is an "oacc kernels" attribute set for OpenACC kernels. */
+ DECL_ATTRIBUTES (child_fn)
+ = tree_cons (get_identifier ("oacc kernels"),
+ NULL_TREE, DECL_ATTRIBUTES (child_fn));
+ }
if (offloaded)
{
@@ -7266,7 +7275,6 @@ expand_omp_target (struct omp_region *region)
enum built_in_function start_ix;
location_t clause_loc;
unsigned int flags_i = 0;
- bool oacc_kernels_p = false;
switch (gimple_omp_target_kind (entry_stmt))
{
@@ -7287,8 +7295,6 @@ expand_omp_target (struct omp_region *region)
flags_i |= GOMP_TARGET_FLAG_EXIT_DATA;
break;
case GF_OMP_TARGET_KIND_OACC_KERNELS:
- oacc_kernels_p = true;
- /* FALLTHROUGH */
case GF_OMP_TARGET_KIND_OACC_PARALLEL:
start_ix = BUILT_IN_GOACC_PARALLEL;
break;
@@ -7451,7 +7457,7 @@ expand_omp_target (struct omp_region *region)
break;
case BUILT_IN_GOACC_PARALLEL:
{
- oacc_set_fn_attrib (child_fn, clauses, oacc_kernels_p, &args);
+ oacc_set_fn_attrib (child_fn, clauses, &args);
tagging = true;
}
/* FALLTHRU */
diff --git gcc/omp-general.c gcc/omp-general.c
index 3f9aec8..9a5ed88 100644
--- gcc/omp-general.c
+++ gcc/omp-general.c
@@ -515,11 +515,10 @@ oacc_replace_fn_attrib (tree fn, tree dims)
/* Scan CLAUSES for launch dimensions and attach them to the oacc
function attribute. Push any that are non-constant onto the ARGS
- list, along with an appropriate GOMP_LAUNCH_DIM tag. IS_KERNEL is
- true, if these are for a kernels region offload function. */
+ list, along with an appropriate GOMP_LAUNCH_DIM tag. */
void
-oacc_set_fn_attrib (tree fn, tree clauses, bool is_kernel, vec<tree> *args)
+oacc_set_fn_attrib (tree fn, tree clauses, vec<tree> *args)
{
/* Must match GOMP_DIM ordering. */
static const omp_clause_code ids[]
@@ -545,9 +544,6 @@ oacc_set_fn_attrib (tree fn, tree clauses, bool is_kernel, vec<tree> *args)
non_const |= GOMP_DIM_MASK (ix);
}
attr = tree_cons (NULL_TREE, dim, attr);
- /* Note kernelness with TREE_PUBLIC. */
- if (is_kernel)
- TREE_PUBLIC (attr) = 1;
}
oacc_replace_fn_attrib (fn, attr);
@@ -616,16 +612,6 @@ oacc_get_fn_attrib (tree fn)
return lookup_attribute (OACC_FN_ATTRIB, DECL_ATTRIBUTES (fn));
}
-/* Return true if this oacc fn attrib is for a kernels offload
- region. We use the TREE_PUBLIC flag of each dimension -- only
- need to check the first one. */
-
-bool
-oacc_fn_attrib_kernels_p (tree attr)
-{
- return TREE_PUBLIC (TREE_VALUE (attr));
-}
-
/* Extract an oacc execution dimension from FN. FN must be an
offloaded function or routine that has already had its execution
dimensions lowered to the target-specific values. */
diff --git gcc/omp-general.h gcc/omp-general.h
index 3cf7fce..d28eb4b 100644
--- gcc/omp-general.h
+++ gcc/omp-general.h
@@ -82,11 +82,9 @@ extern int omp_max_vf (void);
extern int omp_max_simt_vf (void);
extern tree oacc_launch_pack (unsigned code, tree device, unsigned op);
extern void oacc_replace_fn_attrib (tree fn, tree dims);
-extern void oacc_set_fn_attrib (tree fn, tree clauses, bool is_kernel,
- vec<tree> *args);
+extern void oacc_set_fn_attrib (tree fn, tree clauses, vec<tree> *args);
extern tree oacc_build_routine_dims (tree clauses);
extern tree oacc_get_fn_attrib (tree fn);
-extern bool oacc_fn_attrib_kernels_p (tree attr);
extern int oacc_get_fn_dim_size (tree fn, int axis);
extern int oacc_get_ifn_dim_arg (const gimple *stmt);
diff --git gcc/omp-offload.c gcc/omp-offload.c
index beeeb71..15a1cd3 100644
--- gcc/omp-offload.c
+++ gcc/omp-offload.c
@@ -619,7 +619,6 @@ oacc_validate_dims (tree fn, tree attrs, int *dims, int level, unsigned used)
tree purpose[GOMP_DIM_MAX];
unsigned ix;
tree pos = TREE_VALUE (attrs);
- bool is_kernel = oacc_fn_attrib_kernels_p (attrs);
/* Make sure the attribute creator attached the dimension
information. */
@@ -666,13 +665,9 @@ oacc_validate_dims (tree fn, tree attrs, int *dims, int level, unsigned used)
/* Replace the attribute with new values. */
pos = NULL_TREE;
for (ix = GOMP_DIM_MAX; ix--;)
- {
- pos = tree_cons (purpose[ix],
- build_int_cst (integer_type_node, dims[ix]),
- pos);
- if (is_kernel)
- TREE_PUBLIC (pos) = 1;
- }
+ pos = tree_cons (purpose[ix],
+ build_int_cst (integer_type_node, dims[ix]),
+ pos);
oacc_replace_fn_attrib (fn, pos);
}
}
@@ -1455,10 +1450,16 @@ execute_oacc_device_lower ()
int fn_level = oacc_fn_attrib_level (attrs);
if (dump_file)
- fprintf (dump_file, oacc_fn_attrib_kernels_p (attrs)
- ? "Function is kernels offload\n"
- : fn_level < 0 ? "Function is parallel offload\n"
- : "Function is routine level %d\n", fn_level);
+ {
+ if (lookup_attribute ("oacc kernels",
+ DECL_ATTRIBUTES (current_function_decl)))
+ fprintf (dump_file, "Function is OpenACC kernels offload\n");
+ else if (fn_level < 0)
+ fprintf (dump_file, "Function is OpenACC parallel offload\n");
+ else
+ fprintf (dump_file, "Function is OpenACC routine level %d\n",
+ fn_level);
+ }
unsigned outer_mask = fn_level >= 0 ? GOMP_DIM_MASK (fn_level) - 1 : 0;
unsigned used_mask = oacc_loop_partition (loops, outer_mask);
diff --git gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c
index a76351c..70ff428 100644
--- gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c
+++ gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c
@@ -24,16 +24,16 @@ void KERNELS ()
}
/* Check the offloaded function's attributes.
- { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(omp target entrypoint\\)\\)" 1 "ompexp" } } */
+ { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels, omp target entrypoint\\)\\)" 1 "ompexp" } } */
/* Check that exactly one OpenACC kernels construct is analyzed, and that it
can't be parallelized.
{ dg-final { scan-tree-dump-times "FAILED:" 1 "parloops1" } }
- { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(, , \\), omp target entrypoint\\)\\)" 1 "parloops1" } }
+ { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "parloops1" } }
{ dg-final { scan-tree-dump-not "SUCCESS: may be parallelized" "parloops1" } } */
/* Check the offloaded function's classification and compute dimensions (will
always be 1 x 1 x 1 for non-offloading compilation).
- { dg-final { scan-tree-dump-times "(?n)Function is kernels offload" 1 "oaccdevlow" } }
+ { dg-final { scan-tree-dump-times "(?n)Function is OpenACC kernels offload" 1 "oaccdevlow" } }
{ dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } }
- { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), omp target entrypoint\\)\\)" 1 "oaccdevlow" } } */
+ { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccdevlow" } } */
diff --git gcc/testsuite/c-c++-common/goacc/classify-kernels.c gcc/testsuite/c-c++-common/goacc/classify-kernels.c
index 199a73e..c8b0fda 100644
--- gcc/testsuite/c-c++-common/goacc/classify-kernels.c
+++ gcc/testsuite/c-c++-common/goacc/classify-kernels.c
@@ -20,16 +20,16 @@ void KERNELS ()
}
/* Check the offloaded function's attributes.
- { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(omp target entrypoint\\)\\)" 1 "ompexp" } } */
+ { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels, omp target entrypoint\\)\\)" 1 "ompexp" } } */
/* Check that exactly one OpenACC kernels construct is analyzed, and that it
can be parallelized.
{ dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops1" } }
- { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(0, , \\), omp target entrypoint\\)\\)" 1 "parloops1" } }
+ { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(0, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "parloops1" } }
{ dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } */
/* Check the offloaded function's classification and compute dimensions (will
always be 1 x 1 x 1 for non-offloading compilation).
- { dg-final { scan-tree-dump-times "(?n)Function is kernels offload" 1 "oaccdevlow" } }
+ { dg-final { scan-tree-dump-times "(?n)Function is OpenACC kernels offload" 1 "oaccdevlow" } }
{ dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } }
- { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), omp target entrypoint\\)\\)" 1 "oaccdevlow" } } */
+ { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccdevlow" } } */
diff --git gcc/testsuite/c-c++-common/goacc/classify-parallel.c gcc/testsuite/c-c++-common/goacc/classify-parallel.c
index 9d48c1b..4f97301 100644
--- gcc/testsuite/c-c++-common/goacc/classify-parallel.c
+++ gcc/testsuite/c-c++-common/goacc/classify-parallel.c
@@ -23,6 +23,6 @@ void PARALLEL ()
/* Check the offloaded function's classification and compute dimensions (will
always be 1 x 1 x 1 for non-offloading compilation).
- { dg-final { scan-tree-dump-times "(?n)Function is parallel offload" 1 "oaccdevlow" } }
+ { dg-final { scan-tree-dump-times "(?n)Function is OpenACC parallel offload" 1 "oaccdevlow" } }
{ dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } }
{ dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), omp target entrypoint\\)\\)" 1 "oaccdevlow" } } */
diff --git gcc/testsuite/c-c++-common/goacc/classify-routine.c gcc/testsuite/c-c++-common/goacc/classify-routine.c
index 72b02c2..fd89fc1 100644
--- gcc/testsuite/c-c++-common/goacc/classify-routine.c
+++ gcc/testsuite/c-c++-common/goacc/classify-routine.c
@@ -25,6 +25,6 @@ void ROUTINE ()
/* Check the offloaded function's classification and compute dimensions (will
always be 1 x 1 x 1 for non-offloading compilation).
- { dg-final { scan-tree-dump-times "(?n)Function is routine level 1" 1 "oaccdevlow" } }
+ { dg-final { scan-tree-dump-times "(?n)Function is OpenACC routine level 1" 1 "oaccdevlow" } }
{ dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } }
{ dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(0 1, 1 1, 1 1\\), omp declare target, oacc function \\(0 1, 1 0, 1 0\\)\\)\\)" 1 "oaccdevlow" } } */
diff --git gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95 gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95
index fd46d0d..9887d35 100644
--- gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95
+++ gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95
@@ -26,16 +26,16 @@ program main
end program main
! Check the offloaded function's attributes.
-! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(omp target entrypoint\\)\\)" 1 "ompexp" } }
+! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels, omp target entrypoint\\)\\)" 1 "ompexp" } }
! Check that exactly one OpenACC kernels construct is analyzed, and that it
! can't be parallelized.
! { dg-final { scan-tree-dump-times "FAILED:" 1 "parloops1" } }
-! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(, , \\), omp target entrypoint\\)\\)" 1 "parloops1" } }
+! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "parloops1" } }
! { dg-final { scan-tree-dump-not "SUCCESS: may be parallelized" "parloops1" } }
! Check the offloaded function's classification and compute dimensions (will
! always be 1 x 1 x 1 for non-offloading compilation).
-! { dg-final { scan-tree-dump-times "(?n)Function is kernels offload" 1 "oaccdevlow" } }
+! { dg-final { scan-tree-dump-times "(?n)Function is OpenACC kernels offload" 1 "oaccdevlow" } }
! { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } }
-! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), omp target entrypoint\\)\\)" 1 "oaccdevlow" } }
+! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccdevlow" } }
diff --git gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95 gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95
index 053d27c..69c89a9 100644
--- gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95
+++ gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95
@@ -22,16 +22,16 @@ program main
end program main
! Check the offloaded function's attributes.
-! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(omp target entrypoint\\)\\)" 1 "ompexp" } }
+! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels, omp target entrypoint\\)\\)" 1 "ompexp" } }
! Check that exactly one OpenACC kernels construct is analyzed, and that it
! can be parallelized.
! { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops1" } }
-! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(0, , \\), omp target entrypoint\\)\\)" 1 "parloops1" } }
+! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(0, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "parloops1" } }
! { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } }
! Check the offloaded function's classification and compute dimensions (will
! always be 1 x 1 x 1 for non-offloading compilation).
-! { dg-final { scan-tree-dump-times "(?n)Function is kernels offload" 1 "oaccdevlow" } }
+! { dg-final { scan-tree-dump-times "(?n)Function is OpenACC kernels offload" 1 "oaccdevlow" } }
! { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } }
-! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), omp target entrypoint\\)\\)" 1 "oaccdevlow" } }
+! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccdevlow" } }
diff --git gcc/testsuite/gfortran.dg/goacc/classify-parallel.f95 gcc/testsuite/gfortran.dg/goacc/classify-parallel.f95
index 087ff48..e215c79 100644
--- gcc/testsuite/gfortran.dg/goacc/classify-parallel.f95
+++ gcc/testsuite/gfortran.dg/goacc/classify-parallel.f95
@@ -25,6 +25,6 @@ end program main
! Check the offloaded function's classification and compute dimensions (will
! always be 1 x 1 x 1 for non-offloading compilation).
-! { dg-final { scan-tree-dump-times "(?n)Function is parallel offload" 1 "oaccdevlow" } }
+! { dg-final { scan-tree-dump-times "(?n)Function is OpenACC parallel offload" 1 "oaccdevlow" } }
! { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } }
! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), omp target entrypoint\\)\\)" 1 "oaccdevlow" } }
diff --git gcc/testsuite/gfortran.dg/goacc/classify-routine.f95 gcc/testsuite/gfortran.dg/goacc/classify-routine.f95
index 319d767..4ca4067 100644
--- gcc/testsuite/gfortran.dg/goacc/classify-routine.f95
+++ gcc/testsuite/gfortran.dg/goacc/classify-routine.f95
@@ -24,6 +24,6 @@ end subroutine ROUTINE
! Check the offloaded function's classification and compute dimensions (will
! always be 1 x 1 x 1 for non-offloading compilation).
-! { dg-final { scan-tree-dump-times "(?n)Function is routine level 1" 1 "oaccdevlow" } }
+! { dg-final { scan-tree-dump-times "(?n)Function is OpenACC routine level 1" 1 "oaccdevlow" } }
! { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } }
! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(0 1, 1 1, 1 1\\), omp declare target, oacc function \\(0 0, 1 0, 1 0\\)\\)\\)" 1 "oaccdevlow" } }
diff --git gcc/tree-parloops.c gcc/tree-parloops.c
index 7393011..6ce9d84 100644
--- gcc/tree-parloops.c
+++ gcc/tree-parloops.c
@@ -2043,10 +2043,13 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
/* Prepare the GIMPLE_OMP_PARALLEL statement. */
if (oacc_kernels_p)
{
+ gcc_checking_assert (lookup_attribute ("oacc kernels",
+ DECL_ATTRIBUTES (cfun->decl)));
+
tree clause = build_omp_clause (loc, OMP_CLAUSE_NUM_GANGS);
OMP_CLAUSE_NUM_GANGS_EXPR (clause)
= build_int_cst (integer_type_node, n_threads);
- oacc_set_fn_attrib (cfun->decl, clause, true, NULL);
+ oacc_set_fn_attrib (cfun->decl, clause, NULL);
}
else
{
diff --git gcc/tree-ssa-loop.c gcc/tree-ssa-loop.c
index 8b25b41..10c43f3 100644
--- gcc/tree-ssa-loop.c
+++ gcc/tree-ssa-loop.c
@@ -152,10 +152,7 @@ gate_oacc_kernels (function *fn)
if (!flag_openacc)
return false;
- tree oacc_function_attr = oacc_get_fn_attrib (fn->decl);
- if (oacc_function_attr == NULL_TREE)
- return false;
- if (!oacc_fn_attrib_kernels_p (oacc_function_attr))
+ if (!lookup_attribute ("oacc kernels", DECL_ATTRIBUTES (fn->decl)))
return false;
struct loop *loop;
Regards
Thomas