This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [gomp4.1] fold ordered depend(sink) clauses


On Wed, Jul 29, 2015 at 04:48:23PM -0700, Aldy Hernandez wrote:
> @@ -7490,8 +7503,12 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
>  	      == TREE_VEC_LENGTH (OMP_FOR_COND (for_stmt)));
>    gcc_assert (TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt))
>  	      == TREE_VEC_LENGTH (OMP_FOR_INCR (for_stmt)));
> -  gimplify_omp_ctxp->iter_vars.create (TREE_VEC_LENGTH
> -				       (OMP_FOR_INIT (for_stmt)));
> +  gimplify_omp_ctxp->loop_iter_var.create (TREE_VEC_LENGTH
> +					   (OMP_FOR_INIT (for_stmt)));
> +  gimplify_omp_ctxp->loop_dir.create (TREE_VEC_LENGTH
> +				      (OMP_FOR_INIT (for_stmt)));
> +  gimplify_omp_ctxp->loop_const_step.create (TREE_VEC_LENGTH
> +					     (OMP_FOR_INIT (for_stmt)));
>    for (i = 0; i < TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt)); i++)
>      {
>        t = TREE_VEC_ELT (OMP_FOR_INIT (for_stmt), i);

I think the above should be guarded with
  tree c = find_omp_clause (OMP_FOR_CLAUSES (for_stmt), OMP_CLAUSE_ORDERED);
  if (c && OMP_CLAUSE_ORDERED_EXPR (c))
The most common case is that ordered(n) is not present, so we should
optimize for that.

> @@ -7501,10 +7518,10 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
>        gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (decl))
>  		  || POINTER_TYPE_P (TREE_TYPE (decl)));
>        if (TREE_CODE (for_stmt) == OMP_FOR && OMP_FOR_ORIG_DECLS (for_stmt))
> -	gimplify_omp_ctxp->iter_vars.quick_push
> +	gimplify_omp_ctxp->loop_iter_var.quick_push
>  	  (TREE_VEC_ELT (OMP_FOR_ORIG_DECLS (for_stmt), i));
>        else
> -	gimplify_omp_ctxp->iter_vars.quick_push (decl);
> +	gimplify_omp_ctxp->loop_iter_var.quick_push (decl);
>  
>        /* Make sure the iteration variable is private.  */
>        tree c = NULL_TREE;

And all these etc. pushes too, simply remember is_doacross in some bool
variable.

> @@ -8387,6 +8435,228 @@ gimplify_transaction (tree *expr_p, gimple_seq *pre_p)
>    return GS_ALL_DONE;
>  }
>  

Function comment missing.

> +static gimple
> +gimplify_omp_ordered (tree expr, gimple_seq body)
> +{

> +     The basic algorithm is to create a sink vector whose first
> +     element is the GCD of all the first elements, and whose remaining
> +     elements are the minimum of the subsequent columns.
> +
> +     We ignore dependence vectors whose first element is zero because
> +     such dependencies are known to be executed by the same thread.
> +
> +     ?? ^^^^ Is this only applicable for collapse(1) loops?  If so, how
> +     ?? to handle collapse(N) loops where N > 1?

For collapse(N) N > 1, you can't ignore first iter var with 0 offset, you can only
ignore if N first iter vars have 0 offset.

Pretty much for the purpose of the algorithm, you compute "first element"
for collapse(N) N > 1 and ordered(5-N)
	for (iv1 = ..; iv1 < ..; iv1 += step1)
	  for (iv2 = ..; iv2 < ..; iv2 += step2)
	    for (iv3 = ..; iv3 < ..; iv3 += step3)
	      for (iv4 = ..; iv4 < ..; iv4 += step4)
		depend(iv1 + off1, iv2 + off2, iv3 + off3, iv4 + off4)
as: off(N) + off(N-1)*step(N) + iv(N-2)*step(N)*step(N-1)...
(to be checked the case if the directions differ).
So basically, you want to precompute if you'd add some loop counter
in between loop(N) and loop(N+1) that would be initially zero and incremented
by step(N).  The GCD is then performed on this, it is compared to zero,
and then finally split again into the several offsets.
The "is this invalid iteration" check (is offN divisible by stepN)
needs to be done before the merging of course.

Except, now that I think, it is not that easy.  Because we have another
test for "is this invalid iteration", in particularly one performed at
runtime on each of the depends.  Say if offN is negative and loop(N)
is forward loop (thus stepN is positive (note, for backward loop stepN
still might be huge positive value for unsigned iterators)), the check
would be if (ivN + offN >= initialN), for forward loop with positive offN
if (ivN + offN < endvalueN) etc.  Now, not sure if by computing the GCD and
merging several depend clauses into one we preserve those tests or not.

All in all, I think it might be really better to do the depend clause
merging during omp lowering in omp-low.c, where you can call
extract_omp_for_data and inspect the iteration variables, have the steps
computed for you etc.  That is the spot where we probably want to emit some
GOMP_depend_sink call (and GOMP_depend_source) and prepare arguments for it.

> +void
> +funk ()
> +{
> +#pragma omp parallel for ordered(2)
> +  for (i=0; i < N; i++)
> +    for (j=0; j < N; ++j)
> +    {
> +/* We should keep the (sink:i,j-2) by virtue of it the i+0.  The
> +   remaining clauses get folded with a GCD of -2 for `i' and a minimum
> +   of -1 for 'j'.  */

I think we shouldn't keep the useless one (sink: i, j-2) (or keep invalid
ones).  Of course, if we remove all depend clauses, we need to remove the
GIMPLE_OMP_ORDERED stmt altogether.

	Jakub


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]