[gomp4.1] fold ordered depend(sink) clauses
Aldy Hernandez
aldyh@redhat.com
Fri Jul 31 02:56:00 GMT 2015
On 07/30/2015 03:01 AM, Jakub Jelinek wrote:
> On Wed, Jul 29, 2015 at 04:48:23PM -0700, Aldy Hernandez wrote:
>> @@ -7490,8 +7503,12 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
>> == TREE_VEC_LENGTH (OMP_FOR_COND (for_stmt)));
>> gcc_assert (TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt))
>> == TREE_VEC_LENGTH (OMP_FOR_INCR (for_stmt)));
>> - gimplify_omp_ctxp->iter_vars.create (TREE_VEC_LENGTH
>> - (OMP_FOR_INIT (for_stmt)));
>> + gimplify_omp_ctxp->loop_iter_var.create (TREE_VEC_LENGTH
>> + (OMP_FOR_INIT (for_stmt)));
>> + gimplify_omp_ctxp->loop_dir.create (TREE_VEC_LENGTH
>> + (OMP_FOR_INIT (for_stmt)));
>> + gimplify_omp_ctxp->loop_const_step.create (TREE_VEC_LENGTH
>> + (OMP_FOR_INIT (for_stmt)));
>> for (i = 0; i < TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt)); i++)
>> {
>> t = TREE_VEC_ELT (OMP_FOR_INIT (for_stmt), i);
>
> I think the above should be guarded with
> tree c = find_omp_clause (OMP_FOR_CLAUSES (for_stmt), OMP_CLAUSE_ORDERED);
> if (c && OMP_CLAUSE_ORDERED_EXPR (c))
> The most common case is that ordered(n) is not present, so we should
> optimize for that.
Done.
>
>> @@ -7501,10 +7518,10 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
>> gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (decl))
>> || POINTER_TYPE_P (TREE_TYPE (decl)));
>> if (TREE_CODE (for_stmt) == OMP_FOR && OMP_FOR_ORIG_DECLS (for_stmt))
>> - gimplify_omp_ctxp->iter_vars.quick_push
>> + gimplify_omp_ctxp->loop_iter_var.quick_push
>> (TREE_VEC_ELT (OMP_FOR_ORIG_DECLS (for_stmt), i));
>> else
>> - gimplify_omp_ctxp->iter_vars.quick_push (decl);
>> + gimplify_omp_ctxp->loop_iter_var.quick_push (decl);
>>
>> /* Make sure the iteration variable is private. */
>> tree c = NULL_TREE;
>
> And all these etc. pushes too, simply remember is_doacross in some bool
> variable.
Not applicable. I've moved the code to omp-low, as suggested, where the
omp_for_data is available, thus simplifying everything.
>
>> @@ -8387,6 +8435,228 @@ gimplify_transaction (tree *expr_p, gimple_seq *pre_p)
>> return GS_ALL_DONE;
>> }
>>
>
> Function comment missing.
Fixed.
>
>> +static gimple
>> +gimplify_omp_ordered (tree expr, gimple_seq body)
>> +{
>
>> + The basic algorithm is to create a sink vector whose first
>> + element is the GCD of all the first elements, and whose remaining
>> + elements are the minimum of the subsequent columns.
>> +
>> + We ignore dependence vectors whose first element is zero because
>> + such dependencies are known to be executed by the same thread.
>> +
>> + ?? ^^^^ Is this only applicable for collapse(1) loops? If so, how
>> + ?? to handle collapse(N) loops where N > 1?
>
> For collapse(N) N > 1, you can't ignore first iter var with 0 offset, you can only
> ignore if N first iter vars have 0 offset.
>
> Pretty much for the purpose of the algorithm, you compute "first element"
> for collapse(N) N > 1 and ordered(5-N)
> for (iv1 = ..; iv1 < ..; iv1 += step1)
> for (iv2 = ..; iv2 < ..; iv2 += step2)
> for (iv3 = ..; iv3 < ..; iv3 += step3)
> for (iv4 = ..; iv4 < ..; iv4 += step4)
> depend(iv1 + off1, iv2 + off2, iv3 + off3, iv4 + off4)
> as: off(N) + off(N-1)*step(N) + iv(N-2)*step(N)*step(N-1)...
> (to be checked the case if the directions differ).
> So basically, you want to precompute if you'd add some loop counter
> in between loop(N) and loop(N+1) that would be initially zero and incremented
> by step(N). The GCD is then performed on this, it is compared to zero,
> and then finally split again into the several offsets.
> The "is this invalid iteration" check (is offN divisible by stepN)
> needs to be done before the merging of course.
>
> Except, now that I think, it is not that easy. Because we have another
> test for "is this invalid iteration", in particularly one performed at
> runtime on each of the depends. Say if offN is negative and loop(N)
> is forward loop (thus stepN is positive (note, for backward loop stepN
> still might be huge positive value for unsigned iterators)), the check
> would be if (ivN + offN >= initialN), for forward loop with positive offN
> if (ivN + offN < endvalueN) etc. Now, not sure if by computing the GCD and
> merging several depend clauses into one we preserve those tests or not.
Ughh... as a followup? For now I'm going to bail on collapse > 1 and
add a big FIXME.
>
> All in all, I think it might be really better to do the depend clause
> merging during omp lowering in omp-low.c, where you can call
> extract_omp_for_data and inspect the iteration variables, have the steps
> computed for you etc. That is the spot where we probably want to emit some
> GOMP_depend_sink call (and GOMP_depend_source) and prepare arguments for it.
Excellent suggestion. Done.
>
>> +void
>> +funk ()
>> +{
>> +#pragma omp parallel for ordered(2)
>> + for (i=0; i < N; i++)
>> + for (j=0; j < N; ++j)
>> + {
>> +/* We should keep the (sink:i,j-2) by virtue of it the i+0. The
>> + remaining clauses get folded with a GCD of -2 for `i' and a minimum
>> + of -1 for 'j'. */
>
> I think we shouldn't keep the useless one (sink: i, j-2) (or keep invalid
> ones).
Done.
How's this?
Aldy
-------------- next part --------------
commit 9c979c4c4cd53092affbf98c05ddd8c9a60915c7
Author: Aldy Hernandez <aldyh@redhat.com>
Date: Wed Jul 29 13:39:06 2015 -0700
* wide-int.h (wi::gcd): New.
* gimplify.c (struct gimplify_omp_ctx): Rename iter_vars to
loop_iter_var.
(delete_omp_context): Same.
(gimplify_expr): Move code handling OMP_ORDERED into...
(gimplify_omp_ordered): ...here. New.
* omp-low.c (lower_omp_ordered_clauses): New.
(lower_omp_ordered): Call lower_omp_ordered_clauses.
testsuite/
* gcc.dg/gomp/sink-fold-1.c: New.
* gcc.dg/gomp/sink-fold-2.c: New.
* gcc.dg/gomp/sink-fold-3.c: New.
* c-c++-common/gomp/sink-4.c: Look in omplower dump file instead.
* g++.dg/gomp/sink-3.C: Have sink offset be in the opposite
direction. Make variables more readable.
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 2331001..83eb6a1 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -153,7 +153,7 @@ struct gimplify_omp_ctx
splay_tree variables;
hash_set<tree> *privatized_types;
/* Iteration variables in an OMP_FOR. */
- vec<tree> iter_vars;
+ vec<tree> loop_iter_var;
location_t location;
enum omp_clause_default_kind default_kind;
enum omp_region_type region_type;
@@ -392,7 +392,7 @@ delete_omp_context (struct gimplify_omp_ctx *c)
{
splay_tree_delete (c->variables);
delete c->privatized_types;
- c->iter_vars.release ();
+ c->loop_iter_var.release ();
XDELETE (c);
}
@@ -7490,8 +7490,15 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
== TREE_VEC_LENGTH (OMP_FOR_COND (for_stmt)));
gcc_assert (TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt))
== TREE_VEC_LENGTH (OMP_FOR_INCR (for_stmt)));
- gimplify_omp_ctxp->iter_vars.create (TREE_VEC_LENGTH
- (OMP_FOR_INIT (for_stmt)));
+
+ tree c = find_omp_clause (OMP_FOR_CLAUSES (for_stmt), OMP_CLAUSE_ORDERED);
+ bool is_doacross = false;
+ if (c && OMP_CLAUSE_ORDERED_EXPR (c))
+ {
+ is_doacross = true;
+ gimplify_omp_ctxp->loop_iter_var.create (TREE_VEC_LENGTH
+ (OMP_FOR_INIT (for_stmt)));
+ }
for (i = 0; i < TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt)); i++)
{
t = TREE_VEC_ELT (OMP_FOR_INIT (for_stmt), i);
@@ -7500,11 +7507,14 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
gcc_assert (DECL_P (decl));
gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (decl))
|| POINTER_TYPE_P (TREE_TYPE (decl)));
- if (TREE_CODE (for_stmt) == OMP_FOR && OMP_FOR_ORIG_DECLS (for_stmt))
- gimplify_omp_ctxp->iter_vars.quick_push
- (TREE_VEC_ELT (OMP_FOR_ORIG_DECLS (for_stmt), i));
- else
- gimplify_omp_ctxp->iter_vars.quick_push (decl);
+ if (is_doacross)
+ {
+ if (TREE_CODE (for_stmt) == OMP_FOR && OMP_FOR_ORIG_DECLS (for_stmt))
+ gimplify_omp_ctxp->loop_iter_var.quick_push
+ (TREE_VEC_ELT (OMP_FOR_ORIG_DECLS (for_stmt), i));
+ else
+ gimplify_omp_ctxp->loop_iter_var.quick_push (decl);
+ }
/* Make sure the iteration variable is private. */
tree c = NULL_TREE;
@@ -8387,6 +8397,53 @@ gimplify_transaction (tree *expr_p, gimple_seq *pre_p)
return GS_ALL_DONE;
}
+/* Gimplify an OMP_ORDERED construct. EXPR is the tree version. BODY
+ is the OMP_BODY of the original EXPR (which has already been
+ gimplified so it's not present in the EXPR).
+
+ Return the gimplified GIMPLE_OMP_ORDERED tuple. */
+
+static gimple
+gimplify_omp_ordered (tree expr, gimple_seq body)
+{
+ tree c, decls;
+ int failures = 0;
+ unsigned int i;
+
+ if (gimplify_omp_ctxp)
+ for (c = OMP_ORDERED_CLAUSES (expr); c; c = OMP_CLAUSE_CHAIN (c))
+ if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_DEPEND
+ && OMP_CLAUSE_DEPEND_KIND (c) == OMP_CLAUSE_DEPEND_SINK)
+ {
+ bool fail = false;
+ for (decls = OMP_CLAUSE_DECL (c), i = 0;
+ decls && TREE_CODE (decls) == TREE_LIST;
+ decls = TREE_CHAIN (decls), ++i)
+ if (i < gimplify_omp_ctxp->loop_iter_var.length ()
+ && TREE_VALUE (decls) != gimplify_omp_ctxp->loop_iter_var[i])
+ {
+ error_at (OMP_CLAUSE_LOCATION (c),
+ "variable %qE is not an iteration "
+ "of outermost loop %d, expected %qE",
+ TREE_VALUE (decls), i + 1,
+ gimplify_omp_ctxp->loop_iter_var[i]);
+ fail = true;
+ failures++;
+ }
+ /* Avoid being too redundant. */
+ if (!fail && i != gimplify_omp_ctxp->loop_iter_var.length ())
+ {
+ error_at (OMP_CLAUSE_LOCATION (c),
+ "number of variables in depend(sink) "
+ "clause does not match number of "
+ "iteration variables");
+ failures++;
+ }
+ }
+
+ return gimple_build_omp_ordered (body, OMP_ORDERED_CLAUSES (expr));
+}
+
/* Convert the GENERIC expression tree *EXPR_P to GIMPLE. If the
expression produces a value to be used as an operand inside a GIMPLE
statement, the value will be stored back in *EXPR_P. This value will
@@ -9200,38 +9257,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
}
break;
case OMP_ORDERED:
- if (gimplify_omp_ctxp)
- for (tree c = OMP_ORDERED_CLAUSES (*expr_p);
- c; c = OMP_CLAUSE_CHAIN (c))
- if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_DEPEND
- && OMP_CLAUSE_DEPEND_KIND (c) == OMP_CLAUSE_DEPEND_SINK)
- {
- unsigned int n = 0;
- bool fail = false;
- for (tree decls = OMP_CLAUSE_DECL (c);
- decls && TREE_CODE (decls) == TREE_LIST;
- decls = TREE_CHAIN (decls), ++n)
- if (n < gimplify_omp_ctxp->iter_vars.length ()
- && TREE_VALUE (decls)
- != gimplify_omp_ctxp->iter_vars[n])
- {
- error_at (OMP_CLAUSE_LOCATION (c),
- "variable %qE is not an iteration "
- "of outermost loop %d, expected %qE",
- TREE_VALUE (decls), n + 1,
- gimplify_omp_ctxp->iter_vars[n]);
- fail = true;
- }
- /* Avoid being too redundant. */
- if (!fail
- && n != gimplify_omp_ctxp->iter_vars.length ())
- error_at (OMP_CLAUSE_LOCATION (c),
- "number of variables in depend(sink) clause "
- "does not match number of iteration variables");
- }
-
- g = gimple_build_omp_ordered (body,
- OMP_ORDERED_CLAUSES (*expr_p));
+ g = gimplify_omp_ordered (*expr_p, body);
break;
case OMP_CRITICAL:
gimplify_scan_omp_clauses (&OMP_CRITICAL_CLAUSES (*expr_p),
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index e7d21ea..a63bf60 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -11969,6 +11969,230 @@ lower_omp_taskgroup (gimple_stmt_iterator *gsi_p, omp_context *ctx)
}
+/* Fold the OMP_ORDERED_CLAUSES for the OMP_ORDERED in STMT if possible. */
+
+static void
+lower_omp_ordered_clauses (gomp_ordered *ord_stmt, omp_context *ctx)
+{
+ struct omp_for_data fd;
+ if (!ctx->outer || gimple_code (ctx->outer->stmt) != GIMPLE_OMP_FOR)
+ return;
+
+ /* ?? This is stupid. We need to call extract_omp_for_data just
+ to get the number of ordered loops... */
+ extract_omp_for_data (as_a <gomp_for *> (ctx->outer->stmt), &fd, NULL);
+ if (!fd.ordered)
+ return;
+ struct omp_for_data_loop *loops
+ = (struct omp_for_data_loop *)
+ alloca (fd.ordered * sizeof (struct omp_for_data_loop));
+ /* ?? ...and then again to get the actual loops. */
+ extract_omp_for_data (as_a <gomp_for *> (ctx->outer->stmt), &fd, loops);
+
+ /* Canonicalize sink dependence clauses into one folded clause if
+ possible.
+
+ The basic algorithm is to create a sink vector whose first
+ element is the GCD of all the first elements, and whose remaining
+ elements are the minimum of the subsequent columns.
+
+ We ignore dependence vectors whose first element is zero because
+ such dependencies are known to be executed by the same thread.
+
+ We take into account the direction of the loop, so a minimum
+ becomes a maximum if the loop is iterating backwards. We also
+ ignore sink clauses where the loop direction is unknown, or where
+ the offsets are clearly invalid because they are not a multiple
+ of the loop increment.
+
+ For example:
+
+ for (i=0; i < N; ++i)
+ depend(sink:i-8,j-1)
+ depend(sink:i,j-2) // Completely ignored because i+0.
+ depend(sink:i-4,j+3)
+ depend(sink:i-6,j+2)
+
+ Folded clause is:
+
+ depend(sink:-gcd(8,4,6),min(-1,3,2))
+ -or-
+ depend(sink:-2,-1)
+ */
+
+ /* FIXME: Computing GCD's where the first element is zero is
+ non-trivial in the presence of collapsed loops. Do this later. */
+ if (fd.collapse > 1)
+ return;
+
+ unsigned int len = fd.ordered;
+ vec<wide_int> folded_deps;
+ folded_deps.create (len);
+ folded_deps.quick_grow_cleared (len);
+ /* Bitmap representing dimensions in the final dependency vector that
+ have been set. */
+ sbitmap folded_deps_used = sbitmap_alloc (len);
+ bitmap_clear (folded_deps_used);
+ /* TRUE if the first dimension's offset is negative. */
+ bool neg_offset_p = false;
+
+ /* ?? We need to save the original iteration variables stored in the
+ depend clauses, because those in fd.loops[].v have already been
+ gimplified. Perhaps we should use the gimplified versions. ?? */
+ tree *iter_vars = (tree *) alloca (sizeof (tree) * len);
+ memset (iter_vars, 0, sizeof (tree) * len);
+
+ tree *list_p = gimple_omp_ordered_clauses_ptr (ord_stmt);
+ tree c;
+ unsigned int i;
+ while ((c = *list_p) != NULL)
+ {
+ bool remove = false;
+
+ if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_DEPEND
+ || OMP_CLAUSE_DEPEND_KIND (c) != OMP_CLAUSE_DEPEND_SINK)
+ goto next_ordered_clause;
+
+ tree decls;
+ for (decls = OMP_CLAUSE_DECL (c), i = 0;
+ decls && TREE_CODE (decls) == TREE_LIST;
+ decls = TREE_CHAIN (decls), ++i)
+ {
+ gcc_assert (i < len);
+
+ enum {
+ DIR_UNKNOWN,
+ DIR_FORWARD,
+ DIR_BACKWARD
+ } loop_dir;
+ switch (fd.loops[i].cond_code)
+ {
+ case LT_EXPR:
+ case LE_EXPR:
+ loop_dir = DIR_FORWARD;
+ break;
+ case GT_EXPR:
+ case GE_EXPR:
+ loop_dir = DIR_BACKWARD;
+ break;
+ default:
+ loop_dir = DIR_UNKNOWN;
+ gcc_unreachable ();
+ }
+
+ /* While the committee makes up its mind, bail if we have any
+ non-constant steps. */
+ if (TREE_CODE (fd.loops[i].step) != INTEGER_CST)
+ goto gimplify_omp_ordered_ret;
+
+ wide_int offset = TREE_PURPOSE (decls);
+ if (!iter_vars[i])
+ iter_vars[i] = TREE_VALUE (decls);
+
+ /* Ignore invalid offsets that are not multiples of the step. */
+ if (!wi::multiple_of_p
+ (wi::abs (offset), wi::abs ((wide_int) fd.loops[i].step),
+ UNSIGNED))
+ {
+ warning_at (OMP_CLAUSE_LOCATION (c), 0,
+ "ignoring sink clause with offset that is not "
+ "a multiple of the loop step");
+ remove = true;
+ goto next_ordered_clause;
+ }
+
+ /* Calculate the first dimension. The first dimension of
+ the folded dependency vector is the GCD of the first
+ elements, while ignoring any first elements whose offset
+ is 0. */
+ if (i == 0)
+ {
+ /* Ignore dependence vectors whose first dimension is 0. */
+ if (offset == 0)
+ {
+ remove = true;
+ goto next_ordered_clause;
+ }
+ else
+ {
+ neg_offset_p =
+ wi::neg_p (offset,
+ TYPE_SIGN (TREE_TYPE (TREE_PURPOSE (decls))));
+ if ((loop_dir == DIR_FORWARD && !neg_offset_p)
+ || (loop_dir == DIR_BACKWARD && neg_offset_p))
+ {
+ error_at (OMP_CLAUSE_LOCATION (c),
+ "first offset must be in opposite direction "
+ "of loop iterations");
+ goto gimplify_omp_ordered_ret;
+ }
+ /* Initialize the first time around. */
+ if (!bitmap_bit_p (folded_deps_used, 0))
+ {
+ bitmap_set_bit (folded_deps_used, 0);
+ folded_deps[0] = wi::abs (offset);
+ }
+ else
+ folded_deps[i] = wi::gcd (folded_deps[0], offset, UNSIGNED);
+ }
+ }
+ /* Calculate minimum for the remaining dimensions. */
+ else
+ {
+ if (!bitmap_bit_p (folded_deps_used, i))
+ {
+ bitmap_set_bit (folded_deps_used, i);
+ folded_deps[i] = offset;
+ }
+ else if ((loop_dir == DIR_FORWARD
+ && wi::lts_p (offset, folded_deps[i]))
+ || (loop_dir == DIR_BACKWARD
+ && wi::gts_p (offset, folded_deps[i])))
+ folded_deps[i] = offset;
+ }
+ }
+
+ remove = true;
+
+ next_ordered_clause:
+ if (remove)
+ *list_p = OMP_CLAUSE_CHAIN (c);
+ else
+ list_p = &OMP_CLAUSE_CHAIN (c);
+ }
+
+ for (i = 0; i < len; ++i)
+ if (!bitmap_bit_p (folded_deps_used, i))
+ break;
+ if (i == len)
+ {
+ if (neg_offset_p)
+ folded_deps[0] = -folded_deps[0];
+
+ tree vec = NULL;
+ i = len;
+ do
+ {
+ i--;
+ vec = tree_cons (wide_int_to_tree (TREE_TYPE (fd.loops[i].v),
+ folded_deps[i]),
+ iter_vars[i], vec);
+ }
+ while (i);
+
+ c = build_omp_clause (UNKNOWN_LOCATION, OMP_CLAUSE_DEPEND);
+ OMP_CLAUSE_DEPEND_KIND (c) = OMP_CLAUSE_DEPEND_SINK;
+ OMP_CLAUSE_DECL (c) = vec;
+ OMP_CLAUSE_CHAIN (c) = gimple_omp_ordered_clauses (ord_stmt);
+ *gimple_omp_ordered_clauses_ptr (ord_stmt) = c;
+ }
+
+ gimplify_omp_ordered_ret:
+ sbitmap_free (folded_deps_used);
+ folded_deps.release ();
+}
+
+
/* Expand code for an OpenMP ordered directive. */
static void
@@ -11979,6 +12203,8 @@ lower_omp_ordered (gimple_stmt_iterator *gsi_p, omp_context *ctx)
gcall *x;
gbind *bind;
+ lower_omp_ordered_clauses (as_a <gomp_ordered *> (stmt), ctx);
+
push_gimplify_context ();
block = make_node (BLOCK);
diff --git a/gcc/testsuite/c-c++-common/gomp/sink-4.c b/gcc/testsuite/c-c++-common/gomp/sink-4.c
index 7934de2..111178b 100644
--- a/gcc/testsuite/c-c++-common/gomp/sink-4.c
+++ b/gcc/testsuite/c-c++-common/gomp/sink-4.c
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-fopenmp -fdump-tree-gimple" } */
+/* { dg-options "-fopenmp -fdump-tree-omplower" } */
/* Test that we adjust pointer offsets for sink variables
correctly. */
@@ -22,4 +22,4 @@ funk (foo *begin, foo *end)
}
}
-/* { dg-final { scan-tree-dump-times "depend\\(sink:p\\+400\\)" 1 "gimple" } } */
+/* { dg-final { scan-tree-dump-times "depend\\(sink:p\\+400.\\)" 1 "omplower" } } */
diff --git a/gcc/testsuite/g++.dg/gomp/sink-3.C b/gcc/testsuite/g++.dg/gomp/sink-3.C
index 83a742e..4271d66 100644
--- a/gcc/testsuite/g++.dg/gomp/sink-3.C
+++ b/gcc/testsuite/g++.dg/gomp/sink-3.C
@@ -8,7 +8,7 @@ typedef struct {
char stuff[400];
} foo;
-foo *p, *q, *r;
+foo *end, *begin, *p;
template<int N>
void
@@ -16,7 +16,7 @@ funk ()
{
int i,j;
#pragma omp parallel for ordered(1)
- for (p=q; p < q; p--)
+ for (p=end; p > begin; p--)
{
#pragma omp ordered depend(sink:p+1)
void bar ();
diff --git a/gcc/testsuite/gcc.dg/gomp/sink-fold-1.c b/gcc/testsuite/gcc.dg/gomp/sink-fold-1.c
new file mode 100644
index 0000000..f2961b9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gomp/sink-fold-1.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-fopenmp -fdump-tree-omplower" } */
+
+/* Test depend(sink) clause folding. */
+
+int i,j, N;
+
+extern void bar();
+
+void
+funk ()
+{
+#pragma omp parallel for ordered(2)
+ for (i=0; i < N; i++)
+ for (j=0; j < N; ++j)
+ {
+/* We remove the (sink:i,j-2) by virtue of it the i+0. The remaining
+ clauses get folded with a GCD of -2 for `i' and a minimum of -1 for
+ 'j'. */
+#pragma omp ordered \
+ depend(sink:i-8,j-1) \
+ depend(sink:i, j-2) \
+ depend(sink:i-4,j+3) \
+ depend(sink:i-6,j+2)
+ bar();
+#pragma omp ordered depend(source)
+ }
+}
+
+/* { dg-final { scan-tree-dump-times "omp ordered depend\\(sink:i-2,j-1\\)" 1 "omplower" } } */
diff --git a/gcc/testsuite/gcc.dg/gomp/sink-fold-2.c b/gcc/testsuite/gcc.dg/gomp/sink-fold-2.c
new file mode 100644
index 0000000..b3b4ac7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gomp/sink-fold-2.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+
+int i,j, N;
+
+extern void bar();
+
+void
+funk ()
+{
+#pragma omp parallel for ordered(2)
+ for (i=0; i < N; i += 3)
+ for (j=0; j < N; ++j)
+ {
+#pragma omp ordered depend(sink:i-8,j-1) /* { dg-warning "ignoring sink clause with offset that is not a multiple" } */
+#pragma omp ordered depend(sink:i+3,j-1) /* { dg-error "first offset must be in opposite direction" } */
+ bar();
+#pragma omp ordered depend(source)
+ }
+}
diff --git a/gcc/testsuite/gcc.dg/gomp/sink-fold-3.c b/gcc/testsuite/gcc.dg/gomp/sink-fold-3.c
new file mode 100644
index 0000000..4d6293c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gomp/sink-fold-3.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-fopenmp -fdump-tree-omplower" } */
+
+/* Test that we fold sink offsets correctly while taking into account
+ pointer sizes. */
+
+typedef struct {
+ char stuff[400];
+} foo;
+
+void
+funk (foo *begin, foo *end)
+{
+ foo *p;
+#pragma omp parallel for ordered(1)
+ for (p=end; p > begin; p--)
+ {
+#pragma omp ordered depend(sink:p+2) depend(sink:p+4)
+ void bar ();
+ bar();
+#pragma omp ordered depend(source)
+ }
+}
+
+/* { dg-final { scan-tree-dump-times "depend\\(sink:p\\+800B\\)" 1 "omplower" } } */
diff --git a/gcc/wide-int.h b/gcc/wide-int.h
index d8f7b46..c20db61 100644
--- a/gcc/wide-int.h
+++ b/gcc/wide-int.h
@@ -514,6 +514,7 @@ namespace wi
BINARY_FUNCTION div_round (const T1 &, const T2 &, signop, bool * = 0);
BINARY_FUNCTION divmod_trunc (const T1 &, const T2 &, signop,
WI_BINARY_RESULT (T1, T2) *);
+ BINARY_FUNCTION gcd (const T1 &, const T2 &, signop = UNSIGNED);
BINARY_FUNCTION mod_trunc (const T1 &, const T2 &, signop, bool * = 0);
BINARY_FUNCTION smod_trunc (const T1 &, const T2 &);
BINARY_FUNCTION umod_trunc (const T1 &, const T2 &);
@@ -2653,6 +2654,27 @@ wi::divmod_trunc (const T1 &x, const T2 &y, signop sgn,
return quotient;
}
+/* Compute the greatest common divisor of two numbers A and B using
+ Euclid's algorithm. */
+template <typename T1, typename T2>
+inline WI_BINARY_RESULT (T1, T2)
+wi::gcd (const T1 &a, const T2 &b, signop sgn)
+{
+ T1 x, y, z;
+
+ x = wi::abs (a);
+ y = wi::abs (b);
+
+ while (gt_p (x, 0, sgn))
+ {
+ z = mod_trunc (y, x, sgn);
+ y = x;
+ x = z;
+ }
+
+ return y;
+}
+
/* Compute X / Y, rouding towards 0, and return the remainder.
Treat X and Y as having the signedness given by SGN. Indicate
in *OVERFLOW if the division overflows. */
More information about the Gcc-patches
mailing list