This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH, 5/16] Add in_oacc_kernels_region in struct loop
- From: Tom de Vries <Tom_deVries at mentor dot com>
- To: Richard Biener <rguenther at suse dot de>
- Cc: "gcc-patches at gnu dot org" <gcc-patches at gnu dot org>, Jakub Jelinek <jakub at redhat dot com>
- Date: Mon, 16 Nov 2015 12:39:01 +0100
- Subject: Re: [PATCH, 5/16] Add in_oacc_kernels_region in struct loop
- References: <5640BD31 dot 2060602 at mentor dot com> <5640CA57 dot 7090007 at mentor dot com> <alpine dot LSU dot 2 dot 11 dot 1511111153440 dot 4884 at t29 dot fhfr dot qr>
On 11/11/15 11:55, Richard Biener wrote:
On Mon, 9 Nov 2015, Tom de Vries wrote:
On 09/11/15 16:35, Tom de Vries wrote:
Hi,
this patch series for stage1 trunk adds support to:
- parallelize oacc kernels regions using parloops, and
- map the loops onto the oacc gang dimension.
The patch series contains these patches:
1 Insert new exit block only when needed in
transform_to_exit_first_loop_alt
2 Make create_parallel_loop return void
3 Ignore reduction clause on kernels directive
4 Implement -foffload-alias
5 Add in_oacc_kernels_region in struct loop
6 Add pass_oacc_kernels
7 Add pass_dominator_oacc_kernels
8 Add pass_ch_oacc_kernels
9 Add pass_parallelize_loops_oacc_kernels
10 Add pass_oacc_kernels pass group in passes.def
11 Update testcases after adding kernels pass group
12 Handle acc loop directive
13 Add c-c++-common/goacc/kernels-*.c
14 Add gfortran.dg/goacc/kernels-*.f95
15 Add libgomp.oacc-c-c++-common/kernels-*.c
16 Add libgomp.oacc-fortran/kernels-*.f95
The first 9 patches are more or less independent, but patches 10-16 are
intended to be committed at the same time.
Bootstrapped and reg-tested on x86_64.
Built and reg-tested with nvidia accelerator, in combination with a
patch that enables accelerator testing (which is submitted at
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).
I'll post the individual patches in reply to this message.
this patch adds and initializes the in_oacc_kernels_region field in
struct loop.
The field is used to signal to subsequent passes that we're dealing with a
loop in a kernels region that we're trying to parallelize.
Note that we do not parallelize kernels regions with more than one loop nest.
[ In general, kernels regions with more than one loop nest should be split up
into separate kernels regions, but that's not supported atm. ]
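For illustration, the two region shapes this note distinguishes might look as follows in source form. This is only a minimal sketch, not one of the testcases from the series: the function names and the KN size are made up, and when compiled without -fopenacc the pragmas are ignored and the loops simply run serially.

```c
#include <assert.h>

#define KN 8

/* A kernels region containing a single loop nest: the shape the note
   describes as a candidate for parallelization.  */
static void
kernels_single_nest (const int *a, const int *b, int *c)
{
#pragma acc kernels copyin (a[0:KN], b[0:KN]) copyout (c[0:KN])
  {
    for (int i = 0; i < KN; i++)
      c[i] = a[i] + b[i];
  }
}

/* A kernels region containing two sibling loop nests: per the note, such a
   region is not parallelized.  */
static void
kernels_two_nests (const int *a, int *c, int *d)
{
#pragma acc kernels copyin (a[0:KN]) copyout (c[0:KN], d[0:KN])
  {
    for (int i = 0; i < KN; i++)
      c[i] = a[i] * 2;
    for (int i = 0; i < KN; i++)
      d[i] = a[i] + 1;
  }
}

/* Hypothetical driver: return 1 if both functions compute what a serial
   execution would compute.  */
static int
kernels_demo (void)
{
  int a[KN], b[KN], c[KN], d[KN];
  int ok = 1;

  for (int i = 0; i < KN; i++)
    {
      a[i] = i;
      b[i] = 10 * i;
    }

  kernels_single_nest (a, b, c);
  for (int i = 0; i < KN; i++)
    ok = ok && c[i] == 11 * i;

  kernels_two_nests (a, c, d);
  for (int i = 0; i < KN; i++)
    ok = ok && c[i] == 2 * i && d[i] == i + 1;

  return ok;
}
```

Either way the results are the same; the difference is only in whether the loops in the region would get the new flag set.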
I think mark_loops_in_oacc_kernels_region can be greatly simplified.
Both region entry and exit should have the same ->loop_father (a SESE
region). Then you can just walk that loop's inner (and their sibling)
loops, checking their header domination relation with the region entry
and exit (only necessary for direct inner loops).
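A standalone sketch of the walk suggested here, under simplifying assumptions: the loop tree is reduced to its inner/next links, and dominance is modelled by DFS entry/exit numbers on a dominator tree. All dom_* names are hypothetical stand-ins, not GCC's basic_block, struct loop or dominated_by_p.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy basic block: only its position in the dominator tree, encoded as DFS
   entry/exit numbers.  Hypothetical, not GCC's basic_block.  */
struct dom_block
{
  int dfs_in;
  int dfs_out;
};

/* Toy loop: only the links the walk needs.  */
struct dom_loop
{
  struct dom_block *header;
  struct dom_loop *inner;	/* First child loop.  */
  struct dom_loop *next;	/* Next sibling loop.  */
};

/* Return true if B is dominated by A, i.e. B's DFS interval is nested in
   A's interval.  */
static bool
dom_dominated_by_p (const struct dom_block *b, const struct dom_block *a)
{
  return a->dfs_in <= b->dfs_in && b->dfs_out <= a->dfs_out;
}

/* Count the direct children of OUTER whose header is dominated by ENTRY but
   not by EXIT_BB.  EXIT_BB may be NULL.  */
static unsigned int
dom_count_region_loops (const struct dom_loop *outer,
			const struct dom_block *entry,
			const struct dom_block *exit_bb)
{
  unsigned int n = 0;
  for (const struct dom_loop *l = outer->inner; l != NULL; l = l->next)
    {
      if (!dom_dominated_by_p (l->header, entry))
	continue;
      if (exit_bb != NULL && dom_dominated_by_p (l->header, exit_bb))
	continue;
      n++;
    }
  return n;
}

/* Hypothetical example: three sibling loops in straight-line code, one
   before the region, one inside it and one after it.  The dominator tree is
   then a chain, so the DFS intervals nest.  */
static unsigned int
dom_demo_count (void)
{
  struct dom_block before_hdr = { 1, 10 };
  struct dom_block entry = { 2, 9 };
  struct dom_block inside_hdr = { 3, 8 };
  struct dom_block exit_bb = { 4, 7 };
  struct dom_block after_hdr = { 5, 6 };

  struct dom_loop after = { &after_hdr, NULL, NULL };
  struct dom_loop inside = { &inside_hdr, NULL, &after };
  struct dom_loop before = { &before_hdr, NULL, &inside };
  struct dom_loop root = { NULL, &before, NULL };

  return dom_count_region_loops (&root, &entry, &exit_bb);
}
```

In this example only the loop whose header sits between the region entry and exit is counted.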
Updated patch to use the loops structure. Atm I'm also skipping loops
containing sibling loops, since I have no test-cases for that yet.
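To make the sibling restriction concrete, here is a small self-contained model of it, assuming a toy loop type with only inner/next links. The nest_* names are hypothetical, and the in_region flag merely plays the role of in_oacc_kernels_region.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy loop: tree links plus the flag to set.  */
struct nest_loop
{
  struct nest_loop *inner;	/* First child loop.  */
  struct nest_loop *next;	/* Next sibling loop.  */
  bool in_region;
};

/* Mark OUTERMOST and everything nested in it, but only if the nest is a
   plain chain, i.e. no level below OUTERMOST has a sibling.  Return whether
   the nest was marked.  */
static bool
nest_mark_if_chain (struct nest_loop *outermost)
{
  for (struct nest_loop *l = outermost->inner; l != NULL; l = l->inner)
    if (l->next != NULL)
      return false;

  for (struct nest_loop *l = outermost; l != NULL; l = l->inner)
    l->in_region = true;
  return true;
}

/* Three perfectly nested loops: all of them get marked.  */
static int
nest_demo_chain (void)
{
  struct nest_loop l3 = { NULL, NULL, false };
  struct nest_loop l2 = { &l3, NULL, false };
  struct nest_loop l1 = { &l2, NULL, false };
  bool ok = nest_mark_if_chain (&l1);
  return ok && l1.in_region && l2.in_region && l3.in_region;
}

/* An outer loop containing two sibling loops: nothing gets marked.  */
static int
nest_demo_siblings (void)
{
  struct nest_loop l2b = { NULL, NULL, false };
  struct nest_loop l2a = { NULL, &l2b, false };
  struct nest_loop l1 = { &l2a, NULL, false };
  bool ok = nest_mark_if_chain (&l1);
  return !ok && !l1.in_region && !l2a.in_region && !l2b.in_region;
}
```

Because inner points at the first child and siblings hang off next, checking next on the first child at each level is enough to detect a level with more than one loop.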
Thanks,
- Tom
Add in_oacc_kernels_region in struct loop
2015-11-09 Tom de Vries <tom@codesourcery.com>
* cfgloop.h (struct loop): Add in_oacc_kernels_region field.
* omp-low.c (mark_loops_in_oacc_kernels_region): New function.
(expand_omp_target): Call mark_loops_in_oacc_kernels_region.
---
gcc/cfgloop.h | 3 +++
gcc/omp-low.c | 43 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 46 insertions(+)
diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 6af6893..ee73bf9 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -191,6 +191,9 @@ struct GTY ((chain_next ("%h.next"))) loop {
   /* True if we should try harder to vectorize this loop.  */
   bool force_vectorize;
 
+  /* True if the loop is part of an oacc kernels region.  */
+  bool in_oacc_kernels_region;
+
   /* For SIMD loops, this is a unique identifier of the loop, referenced
      by IFN_GOMP_SIMD_VF, IFN_GOMP_SIMD_LANE and IFN_GOMP_SIMD_LAST_LANE
      builtins.  */
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 5f76434..fba7bbd 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -12450,6 +12450,46 @@ get_oacc_ifn_dim_arg (const gimple *stmt)
   return (int) axis;
 }
 
+/* Mark the loops inside the kernels region starting at REGION_ENTRY and ending
+   at REGION_EXIT.  */
+
+static void
+mark_loops_in_oacc_kernels_region (basic_block region_entry,
+                                   basic_block region_exit)
+{
+  struct loop *outer = region_entry->loop_father;
+  gcc_assert (region_exit == NULL || outer == region_exit->loop_father);
+
+  /* Don't parallelize the kernels region if it contains more than one outer
+     loop.  */
+  unsigned int nr_outer_loops = 0;
+  struct loop *single_outer;
+  for (struct loop *loop = outer->inner; loop != NULL; loop = loop->next)
+    {
+      gcc_assert (loop_outer (loop) == outer);
+
+      if (!dominated_by_p (CDI_DOMINATORS, loop->header, region_entry))
+        continue;
+
+      if (region_exit != NULL
+          && dominated_by_p (CDI_DOMINATORS, loop->header, region_exit))
+        continue;
+
+      nr_outer_loops++;
+      single_outer = loop;
+    }
+  if (nr_outer_loops != 1)
+    return;
+
+  for (struct loop *loop = single_outer->inner; loop != NULL; loop = loop->inner)
+    if (loop->next)
+      return;
+
+  /* Mark the loops in the region.  */
+  for (struct loop *loop = single_outer; loop != NULL; loop = loop->inner)
+    loop->in_oacc_kernels_region = true;
+}
+
 /* Expand the GIMPLE_OMP_TARGET starting at REGION.  */
 
 static void
@@ -12505,6 +12545,9 @@ expand_omp_target (struct omp_region *region)
   entry_bb = region->entry;
   exit_bb = region->exit;
 
+  if (gimple_omp_target_kind (entry_stmt) == GF_OMP_TARGET_KIND_OACC_KERNELS)
+    mark_loops_in_oacc_kernels_region (region->entry, region->exit);
+
   if (offloaded)
     {
       unsigned srcidx, dstidx, num;