This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][PR67476] Add param parloops-schedule


On 14/09/15 16:28, Tom de Vries wrote:
On 14/09/15 11:09, Jakub Jelinek wrote:
On Fri, Sep 11, 2015 at 03:28:01PM +0200, Tom de Vries wrote:
On 11/09/15 12:57, Jakub Jelinek wrote:
On Fri, Sep 11, 2015 at 12:55:00PM +0200, Tom de Vries wrote:
Hi,

this patch adds a param parloops-schedule=<0-4>, which sets the
omp schedule
for loops paralellized by parloops.

The <0-4> maps onto <static|dynamic|guided|auto|runtime>.

Bootstrapped and reg-tested on x86_64.

OK for trunk?
I don't really like it, the mapping of the integers to the enum values
is non-obvious and hard to remember.
Perhaps add support for enumeration params if you want this instead?


This patch adds handling of a DEFPARAMENUM macro, which is similar to
the
DEFPARAM macro, but allows the values to be named.

So the definition of param parloop-schedule becomes:
...
DEFPARAMENUM PARAM_PARLOOPS_SCHEDULE,
              "parloops-schedule",
              "Schedule type of omp schedule for loops parallelized by "
              "parloops (static, dynamic, guided, auto, runtime)",
              0, 0, 4, "static", "dynamic", "guided", "auto", "runtime")
...
[ I'll repost the original patch containing this update. ]

OK for trunk if x86_64 bootstrap and reg-test succeeds?

That still allows numeric arguments for the param, which is IMHO
undesirable.  If it is enum kind, only the enum values should be
accepted.

Fixed.

Also, it would be nice if params.h in that case would define an enum with
the values like
PARAM_PARLOOPS_SCHEDULE_KIND_{static,dynamic,guided,auto,runtime}, so use
values not wrapped in ""s and only in a macro or generator make both
enums and string array out of that.

Done, there's now a file params-enum.h containing these enums.

There is also the question if we can use __VA_ARGS__, isn't that C99 or
C++11 and later feature?  I see gengtype.h and ipa-icf-gimple.h use
that too, so maybe yes, but am not sure.


I've removed the use of variadic macros, meaning we now use
DEFPARAMENUM5 instead of DEFPARAMENUM.

Also, I've remove the min/max arguments in DEFPARAMENUM5.

And I've ensured that the default is now specified as a string rather
than an integer.

So the new definition of PARAM_PARLOOPS_SCHEDULE looks like:

DEFPARAMENUM5 (PARAM_PARLOOPS_SCHEDULE,
               "parloops-schedule",
               "Schedule type of omp schedule for loops parallelized by "
               "parloops (static, dynamic, guided, auto, runtime)",
               static,
               static, dynamic, guided, auto, runtime)

[ Again, I'll repost the original patch containing this update. ]

This patch adds param parloops-schedule (and now uses the enum associated with the param).

OK for trunk, if bootstrap and reg-test on x86_64 succeeds?

Thanks,
- Tom

Add param parloops-schedule

2015-09-10  Tom de Vries  <tom@codesourcery.com>

	PR tree-optimization/67476
	* doc/invoke.texi (@item parloops-schedule): New item.
	* omp-low.c (expand_omp_for_generic): Handle simple latch.  Add missing
	phis.  Handle original loop.
	* params.def (PARAM_PARLOOPS_SCHEDULE): New DEFPARAMENUM5.
	* tree-parloops.c: Include params-enum.h.
	(create_parallel_loop): Handle PARAM_PARLOOPS_SCHEDULE.

	* testsuite/libgomp.c/autopar-3.c: New test.
	* testsuite/libgomp.c/autopar-4.c: New test.
	* testsuite/libgomp.c/autopar-5.c: New test.
	* testsuite/libgomp.c/autopar-6.c: New test.
	* testsuite/libgomp.c/autopar-7.c: New test.
	* testsuite/libgomp.c/autopar-8.c: New test.
---
 gcc/doc/invoke.texi                     |  4 +++
 gcc/omp-low.c                           | 57 +++++++++++++++++++++++++++++++--
 gcc/params.def                          | 12 +++++++
 gcc/tree-parloops.c                     | 26 ++++++++++++++-
 libgomp/testsuite/libgomp.c/autopar-3.c |  4 +++
 libgomp/testsuite/libgomp.c/autopar-4.c |  4 +++
 libgomp/testsuite/libgomp.c/autopar-5.c |  4 +++
 libgomp/testsuite/libgomp.c/autopar-6.c |  4 +++
 libgomp/testsuite/libgomp.c/autopar-7.c |  4 +++
 libgomp/testsuite/libgomp.c/autopar-8.c |  4 +++
 10 files changed, 119 insertions(+), 4 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.c/autopar-3.c
 create mode 100644 libgomp/testsuite/libgomp.c/autopar-4.c
 create mode 100644 libgomp/testsuite/libgomp.c/autopar-5.c
 create mode 100644 libgomp/testsuite/libgomp.c/autopar-6.c
 create mode 100644 libgomp/testsuite/libgomp.c/autopar-7.c
 create mode 100644 libgomp/testsuite/libgomp.c/autopar-8.c

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 76e5e29..2221795 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11005,6 +11005,10 @@ automaton.  The default is 50.
 Chunk size of omp schedule for loops parallelized by parloops.  The default
 is 0.
 
+@item parloops-schedule
+Schedule type of omp schedule for loops parallelized by parloops (static,
+dynamic, guided, auto, runtime).  The default is static.
+
 @end table
 @end table
 
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 88a5149..4f0498b 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -239,6 +239,7 @@ static vec<omp_context *> taskreg_contexts;
 
 static void scan_omp (gimple_seq *, omp_context *);
 static tree scan_omp_1_op (tree *, int *, void *);
+static gphi *find_phi_with_arg_on_edge (tree, edge);
 
 #define WALK_SUBSTMTS  \
     case GIMPLE_BIND: \
@@ -6155,7 +6156,9 @@ expand_omp_for_generic (struct omp_region *region,
   if (!broken_loop)
     {
       l2_bb = create_empty_bb (cont_bb);
-      gcc_assert (BRANCH_EDGE (cont_bb)->dest == l1_bb);
+      gcc_assert (BRANCH_EDGE (cont_bb)->dest == l1_bb
+		  || (single_succ_edge (BRANCH_EDGE (cont_bb)->dest)->dest
+		      == l1_bb));
       gcc_assert (EDGE_COUNT (cont_bb->succs) == 2);
     }
   else
@@ -6429,8 +6432,12 @@ expand_omp_for_generic (struct omp_region *region,
       remove_edge (e);
 
       make_edge (cont_bb, l2_bb, EDGE_FALSE_VALUE);
-      add_bb_to_loop (l2_bb, cont_bb->loop_father);
       e = find_edge (cont_bb, l1_bb);
+      if (e == NULL)
+	{
+	  e = BRANCH_EDGE (cont_bb);
+	  gcc_assert (single_succ (e->dest) == l1_bb);
+	}
       if (gimple_omp_for_combined_p (fd->for_stmt))
 	{
 	  remove_edge (e);
@@ -6454,7 +6461,45 @@ expand_omp_for_generic (struct omp_region *region,
 	  e->flags = EDGE_FALLTHRU;
 	}
       make_edge (l2_bb, l0_bb, EDGE_TRUE_VALUE);
+    }
+
+
+  if (gimple_in_ssa_p (cfun))
+    {
+      gphi_iterator psi;
+
+      for (psi = gsi_start_phis (l3_bb); !gsi_end_p (psi); gsi_next (&psi))
+	{
+	  source_location locus;
+	  gphi *nphi;
+	  gphi *exit_phi = psi.phi ();
+
+	  edge l2_to_l3 = find_edge (l2_bb, l3_bb);
+	  tree exit_res = PHI_ARG_DEF_FROM_EDGE (exit_phi, l2_to_l3);
 
+	  basic_block latch = BRANCH_EDGE (cont_bb)->dest;
+	  edge latch_to_l1 = find_edge (latch, l1_bb);
+	  gphi *inner_phi = find_phi_with_arg_on_edge (exit_res, latch_to_l1);
+
+	  tree t = gimple_phi_result (exit_phi);
+	  tree new_res = copy_ssa_name (t, NULL);
+	  nphi = create_phi_node (new_res, l0_bb);
+
+	  edge l0_to_l1 = find_edge (l0_bb, l1_bb);
+	  t = PHI_ARG_DEF_FROM_EDGE (inner_phi, l0_to_l1);
+	  locus = gimple_phi_arg_location_from_edge (inner_phi, l0_to_l1);
+	  edge entry_to_l0 = find_edge (entry_bb, l0_bb);
+	  add_phi_arg (nphi, t, entry_to_l0, locus);
+
+	  edge l2_to_l0 = find_edge (l2_bb, l0_bb);
+	  add_phi_arg (nphi, exit_res, l2_to_l0, UNKNOWN_LOCATION);
+
+	  add_phi_arg (inner_phi, new_res, l0_to_l1, UNKNOWN_LOCATION);
+	};
+    }
+
+  if (!broken_loop)
+    {
       set_immediate_dominator (CDI_DOMINATORS, l2_bb,
 			       recompute_dominator (CDI_DOMINATORS, l2_bb));
       set_immediate_dominator (CDI_DOMINATORS, l3_bb,
@@ -6464,14 +6509,20 @@ expand_omp_for_generic (struct omp_region *region,
       set_immediate_dominator (CDI_DOMINATORS, l1_bb,
 			       recompute_dominator (CDI_DOMINATORS, l1_bb));
 
+      struct loop *loop = l1_bb->loop_father;
+      add_bb_to_loop (l2_bb, entry_bb->loop_father);
+
       struct loop *outer_loop = alloc_loop ();
       outer_loop->header = l0_bb;
       outer_loop->latch = l2_bb;
       add_loop (outer_loop, l0_bb->loop_father);
 
+      if (loop != entry_bb->loop_father)
+	return;
+
       if (!gimple_omp_for_combined_p (fd->for_stmt))
 	{
-	  struct loop *loop = alloc_loop ();
+	  loop = alloc_loop ();
 	  loop->header = l1_bb;
 	  /* The loop may have multiple latches.  */
 	  add_loop (loop, outer_loop);
diff --git a/gcc/params.def b/gcc/params.def
index 34e2025..76699d4 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -35,6 +35,11 @@ along with GCC; see the file COPYING3.  If not see
      - The maximum acceptable value for the parameter (if greater than
      the minimum).
 
+   The DEFPARAMENUM<N> macro is similar, but instead of the minumum and maximum
+   arguments, it contains a list of <N> allowed strings, corresponding to
+   integer values 0..<N>-1.  Note that the default argument needs to be
+   specified as one of the allowed strings, rather than an integer value.
+
    Be sure to add an entry to invoke.texi summarizing the parameter.  */
 
 /* When branch is predicted to be taken with probability lower than this
@@ -1145,6 +1150,13 @@ DEFPARAM (PARAM_PARLOOPS_CHUNK_SIZE,
 	  "parloops-chunk-size",
 	  "Chunk size of omp schedule for loops parallelized by parloops",
 	  0, 0, 0)
+
+DEFPARAMENUM5 (PARAM_PARLOOPS_SCHEDULE,
+	       "parloops-schedule",
+	       "Schedule type of omp schedule for loops parallelized by "
+	       "parloops (static, dynamic, guided, auto, runtime)",
+	       static,
+	       static, dynamic, guided, auto, runtime)
 /*
 
 Local variables:
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index c164121..4df631e 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -58,6 +58,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cgraph.h"
 #include "tree-ssa.h"
 #include "params.h"
+#include "params-enum.h"
 
 /* This pass tries to distribute iterations of loops into several threads.
    The implementation is straightforward -- for each loop we test whether its
@@ -2092,8 +2093,31 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
   gimple_cond_set_lhs (cond_stmt, cvar_base);
   type = TREE_TYPE (cvar);
   t = build_omp_clause (loc, OMP_CLAUSE_SCHEDULE);
-  OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_STATIC;
   int chunk_size = PARAM_VALUE (PARAM_PARLOOPS_CHUNK_SIZE);
+  enum PARAM_PARLOOPS_SCHEDULE_KIND schedule_type \
+    = (enum PARAM_PARLOOPS_SCHEDULE_KIND) PARAM_VALUE (PARAM_PARLOOPS_SCHEDULE);
+  switch (schedule_type)
+    {
+    case PARAM_PARLOOPS_SCHEDULE_KIND_static:
+      OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_STATIC;
+      break;
+    case PARAM_PARLOOPS_SCHEDULE_KIND_dynamic:
+      OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_DYNAMIC;
+      break;
+    case PARAM_PARLOOPS_SCHEDULE_KIND_guided:
+      OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_GUIDED;
+      break;
+    case PARAM_PARLOOPS_SCHEDULE_KIND_auto:
+      OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_AUTO;
+      chunk_size = 0;
+      break;
+    case PARAM_PARLOOPS_SCHEDULE_KIND_runtime:
+      OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_RUNTIME;
+      chunk_size = 0;
+      break;
+    default:
+      gcc_unreachable ();
+    }
   if (chunk_size != 0)
     OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (t)
       = build_int_cst (integer_type_node, chunk_size);
diff --git a/libgomp/testsuite/libgomp.c/autopar-3.c b/libgomp/testsuite/libgomp.c/autopar-3.c
new file mode 100644
index 0000000..1c25a44
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/autopar-3.c
@@ -0,0 +1,4 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=4 -ffast-math --param parloops-schedule=dynamic" } */
+
+#include "autopar-1.c"
diff --git a/libgomp/testsuite/libgomp.c/autopar-4.c b/libgomp/testsuite/libgomp.c/autopar-4.c
new file mode 100644
index 0000000..78c77d9
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/autopar-4.c
@@ -0,0 +1,4 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=4 -ffast-math --param parloops-schedule=dynamic --param parloops-chunk-size=100" } */
+
+#include "autopar-1.c"
diff --git a/libgomp/testsuite/libgomp.c/autopar-5.c b/libgomp/testsuite/libgomp.c/autopar-5.c
new file mode 100644
index 0000000..f0acb20
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/autopar-5.c
@@ -0,0 +1,4 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=4 -ffast-math --param parloops-schedule=guided" } */
+
+#include "autopar-1.c"
diff --git a/libgomp/testsuite/libgomp.c/autopar-6.c b/libgomp/testsuite/libgomp.c/autopar-6.c
new file mode 100644
index 0000000..f6e723e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/autopar-6.c
@@ -0,0 +1,4 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=4 -ffast-math --param parloops-schedule=guided --param parloops-chunk-size=100" } */
+
+#include "autopar-1.c"
diff --git a/libgomp/testsuite/libgomp.c/autopar-7.c b/libgomp/testsuite/libgomp.c/autopar-7.c
new file mode 100644
index 0000000..5f15508f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/autopar-7.c
@@ -0,0 +1,4 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=4 -ffast-math --param parloops-schedule=auto" } */
+
+#include "autopar-1.c"
diff --git a/libgomp/testsuite/libgomp.c/autopar-8.c b/libgomp/testsuite/libgomp.c/autopar-8.c
new file mode 100644
index 0000000..6099e9f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/autopar-8.c
@@ -0,0 +1,4 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=4 -ffast-math --param parloops-schedule=runtime" } */
+
+#include "autopar-1.c"
-- 
1.9.1


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]