This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH, 5/16] Add in_oacc_kernels_region in struct loop


On 11/11/15 11:55, Richard Biener wrote:
On Mon, 9 Nov 2015, Tom de Vries wrote:

On 09/11/15 16:35, Tom de Vries wrote:
Hi,

this patch series for stage1 trunk adds support to:
- parallelize oacc kernels regions using parloops, and
- map the loops onto the oacc gang dimension.

The patch series contains these patches:

       1    Insert new exit block only when needed in
          transform_to_exit_first_loop_alt
       2    Make create_parallel_loop return void
       3    Ignore reduction clause on kernels directive
       4    Implement -foffload-alias
       5    Add in_oacc_kernels_region in struct loop
       6    Add pass_oacc_kernels
       7    Add pass_dominator_oacc_kernels
       8    Add pass_ch_oacc_kernels
       9    Add pass_parallelize_loops_oacc_kernels
      10    Add pass_oacc_kernels pass group in passes.def
      11    Update testcases after adding kernels pass group
      12    Handle acc loop directive
      13    Add c-c++-common/goacc/kernels-*.c
      14    Add gfortran.dg/goacc/kernels-*.f95
      15    Add libgomp.oacc-c-c++-common/kernels-*.c
      16    Add libgomp.oacc-fortran/kernels-*.f95

The first 9 patches are more or less independent, but patches 10-16 are
intended to be committed at the same time.

Bootstrapped and reg-tested on x86_64.

Built and reg-tested with nvidia accelerator, in combination with a
patch that enables accelerator testing (which is submitted at
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).

I'll post the individual patches in reply to this message.

this patch adds and initializes the in_oacc_kernels_region field in
struct loop.

The field is used to signal to subsequent passes that we're dealing with a
loop in a kernels region that we're trying to parallelize.

Note that we do not parallelize kernels regions with more than one loop nest.
[ In general, kernels regions with more than one loop nest should be split up
into separate kernels regions, but that's not supported atm. ]
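
As a rough illustration of how a later pass might consult such a flag, here is a minimal, self-contained sketch. It is not code from this patch series: struct flagged_loop and count_kernels_candidates are hypothetical names, and the flat array stands in for GCC's real loop tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for GCC's struct loop: only the flag
   added by this patch is modelled.  */
struct flagged_loop
{
  bool in_oacc_kernels_region;
};

/* Count the loops that a kernels-specific pass would look at, i.e. only
   those marked as being in an oacc kernels region.  A flat array is used
   here purely for illustration.  */
static unsigned int
count_kernels_candidates (const struct flagged_loop *loops, size_t n)
{
  assert (loops != NULL || n == 0);

  unsigned int count = 0;
  for (size_t i = 0; i < n; i++)
    if (loops[i].in_oacc_kernels_region)
      count++;
  return count;
}
```

Unmarked loops are simply skipped, so a pass gated this way leaves loops outside the kernels region untouched.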

I think mark_loops_in_oacc_kernels_region can be greatly simplified.

Both region entry and exit should have the same ->loop_father (a SESE
region).  Then you can just walk that loop's inner loops (and their
siblings), checking their header's domination relation with the region
entry and exit (only necessary for direct inner loops).

Updated patch to use the loops structure. Atm I'm also skipping loops containing sibling loops, since I have no test-cases for that yet.
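
The walk described above can be modelled outside of GCC with a small sketch. The names toy_loop and toy_mark_loops are hypothetical, and the in_region flag is an assumed stand-in for the two dominated_by_p checks (header dominated by the region entry, and not by the region exit); the tree links mirror the inner/next fields of struct loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct loop: tree links plus the new flag.  */
struct toy_loop
{
  struct toy_loop *inner;       /* First child loop.  */
  struct toy_loop *next;        /* Next sibling loop.  */
  bool in_region;               /* Assumed result of the dominance checks.  */
  bool in_oacc_kernels_region;  /* Flag set by the marking walk.  */
};

/* Mark the single loop nest in the region below OUTER.  Return true if
   loops were marked, false if the region was skipped.  */
static bool
toy_mark_loops (struct toy_loop *outer)
{
  assert (outer != NULL);

  /* Skip the region unless exactly one direct inner loop lies in it.  */
  unsigned int nr_outer_loops = 0;
  struct toy_loop *single_outer = NULL;
  for (struct toy_loop *loop = outer->inner; loop != NULL; loop = loop->next)
    {
      if (!loop->in_region)
        continue;
      nr_outer_loops++;
      single_outer = loop;
    }
  if (nr_outer_loops != 1)
    return false;

  /* Skip nests in which some level has sibling loops.  */
  for (struct toy_loop *loop = single_outer->inner; loop != NULL;
       loop = loop->inner)
    if (loop->next != NULL)
      return false;

  /* Mark every loop of the nest, outermost to innermost.  */
  for (struct toy_loop *loop = single_outer; loop != NULL; loop = loop->inner)
    loop->in_oacc_kernels_region = true;
  return true;
}
```

Only the direct inner loops of OUTER need the dominance filter; once a single nest has been selected, everything below it is inside the region by construction.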

Thanks,
- Tom

Add in_oacc_kernels_region in struct loop

2015-11-09  Tom de Vries  <tom@codesourcery.com>

	* cfgloop.h (struct loop): Add in_oacc_kernels_region field.
	* omp-low.c (mark_loops_in_oacc_kernels_region): New function.
	(expand_omp_target): Call mark_loops_in_oacc_kernels_region.

---
 gcc/cfgloop.h |  3 +++
 gcc/omp-low.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 6af6893..ee73bf9 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -191,6 +191,9 @@ struct GTY ((chain_next ("%h.next"))) loop {
   /* True if we should try harder to vectorize this loop.  */
   bool force_vectorize;
 
+  /* True if the loop is part of an oacc kernels region.  */
+  bool in_oacc_kernels_region;
+
   /* For SIMD loops, this is a unique identifier of the loop, referenced
      by IFN_GOMP_SIMD_VF, IFN_GOMP_SIMD_LANE and IFN_GOMP_SIMD_LAST_LANE
      builtins.  */
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 5f76434..fba7bbd 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -12450,6 +12450,46 @@ get_oacc_ifn_dim_arg (const gimple *stmt)
   return (int) axis;
 }
 
+/* Mark the loops inside the kernels region starting at REGION_ENTRY and ending
+   at REGION_EXIT.  */
+
+static void
+mark_loops_in_oacc_kernels_region (basic_block region_entry,
+				   basic_block region_exit)
+{
+  struct loop *outer = region_entry->loop_father;
+  gcc_assert (region_exit == NULL || outer == region_exit->loop_father);
+
+  /* Don't parallelize the kernels region if it contains more than one outer
+     loop.  */
+  unsigned int nr_outer_loops = 0;
+  struct loop *single_outer;
+  for (struct loop *loop = outer->inner; loop != NULL; loop = loop->next)
+    {
+      gcc_assert (loop_outer (loop) == outer);
+
+      if (!dominated_by_p (CDI_DOMINATORS, loop->header, region_entry))
+	continue;
+
+      if (region_exit != NULL
+	  && dominated_by_p (CDI_DOMINATORS, loop->header, region_exit))
+	continue;
+
+      nr_outer_loops++;
+      single_outer = loop;
+    }
+  if (nr_outer_loops != 1)
+    return;
+
+  for (struct loop *loop = single_outer->inner; loop != NULL; loop = loop->inner)
+    if (loop->next)
+      return;
+
+  /* Mark the loops in the region.  */
+  for (struct loop *loop = single_outer; loop != NULL; loop = loop->inner)
+    loop->in_oacc_kernels_region = true;
+}
+
 /* Expand the GIMPLE_OMP_TARGET starting at REGION.  */
 
 static void
@@ -12505,6 +12545,9 @@ expand_omp_target (struct omp_region *region)
   entry_bb = region->entry;
   exit_bb = region->exit;
 
+  if (gimple_omp_target_kind (entry_stmt) == GF_OMP_TARGET_KIND_OACC_KERNELS)
+    mark_loops_in_oacc_kernels_region (region->entry, region->exit);
+
   if (offloaded)
     {
       unsigned srcidx, dstidx, num;
