This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[gomp4, committed] Set safelen to INT_MAX for oacc independent pragma


[ was: Re: [PATCH, gomp4] Propagate independent clause for OpenACC kernels pass ]

On 14/07/15 11:48, Jakub Jelinek wrote:
On Tue, Jul 14, 2015 at 05:35:28PM +0800, Chung-Lin Tang wrote:
The wording of OpenACC independent is simpler:
"... the independent clause tells the implementation that the iterations of this loop
are data-independent with respect to each other." -- OpenACC spec 2.7.9

I would say this implies even more relaxed conditions than OpenMP simd safelen,
essentially saying that the compiler doesn't even need dependence analysis; just
assume independence of iterations.

safelen is also saying that the compiler doesn't even need dependence
analysis.  It is just that only some transformations of the loop are ok
without dependence analysis; others still require it.
Classical vectorization (instead of doing one iteration at a time, you do
up to safelen consecutive iterations together for the first statement in
the loop, then for the second statement, etc.) is ok without dependence
analysis, but e.g. reversing the loop so that the last iteration runs first
and so on down to the first, or running the iterations in random order,
is not ok.
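
To make the distinction above concrete, here is a minimal sketch in plain C (no OpenMP runtime needed). It assumes a toy loop with a loop-carried dependence of distance 4, the kind of loop a user might annotate with `#pragma omp simd safelen(4)`. The names `run_sequential`, `run_chunked`, `run_reversed`, `init` and `arrays_equal` are hypothetical helpers for illustration only, not anything from GCC.

```c
#define N 16
#define SAFELEN 4

/* Reference semantics: one iteration at a time, in order. */
static void run_sequential(int *a) {
    for (int i = SAFELEN; i < N; i++)
        a[i] = a[i - SAFELEN] + 1;
}

/* Vector-style execution: take up to SAFELEN consecutive iterations,
   do all the loads and adds first, then all the stores. */
static void run_chunked(int *a) {
    for (int i = SAFELEN; i < N; i += SAFELEN) {
        int tmp[SAFELEN];
        int n = (N - i < SAFELEN) ? N - i : SAFELEN;
        for (int k = 0; k < n; k++)
            tmp[k] = a[i + k - SAFELEN] + 1;
        for (int k = 0; k < n; k++)
            a[i + k] = tmp[k];
    }
}

/* Reversed iteration order: not covered by the safelen guarantee. */
static void run_reversed(int *a) {
    for (int i = N - 1; i >= SAFELEN; i--)
        a[i] = a[i - SAFELEN] + 1;
}

static void init(int *a) {
    for (int i = 0; i < N; i++)
        a[i] = 0;
}

static int arrays_equal(const int *x, const int *y) {
    for (int i = 0; i < N; i++)
        if (x[i] != y[i])
            return 0;
    return 1;
}
```

Starting from an all-zero array, the chunked version produces the same result as the sequential one, while the reversed version does not, because each iteration then reads a value that has not been updated yet.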

So if OpenACC independent means there are no dependencies in between
iterations, the OpenMP counterpart here is #pragma omp for simd schedule (auto)
or #pragma omp distribute parallel for simd schedule (auto).

schedule(auto) appears to correspond to the OpenACC 'auto' clause, or
what is implied in a kernels compute construct, but I'm not sure it implies
no dependencies between iterations?

By the schedule(auto) I meant that the user tells the compiler it can
parallelize the loop with whatever schedule it wants.  Other schedules are
quite well defined: given the number of threads in the team, it is known
which thread gets which iteration, so the user could rely on a particular
parallelization and the loop iterations need not be 100% independent.  With
schedule(auto) you say it is up to the compiler to schedule them, thus they
really have to be all independent.
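
As an illustration of the correspondence discussed above, the following sketch shows a loop whose iterations are fully independent, annotated with the combined OpenMP construct and schedule(auto). The function name `vec_add` is hypothetical, and the OpenACC spelling appears only in a comment. Without OpenMP support enabled the pragma is ignored and the loop simply runs sequentially.

```c
/* Element-wise add: no iteration reads or writes data touched by
   any other iteration, so any execution order gives the same result. */
static void vec_add(const float *a, const float *b, float *c, int n) {
    /* Rough OpenACC equivalent: #pragma acc loop independent */
    #pragma omp parallel for simd schedule(auto)
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}
```

Because the schedule is left to the implementation, the code cannot rely on which thread executes which iteration, which is why every iteration has to be independent of the others.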

Putting aside the semantic issues, safelen>0 currently turns on a certain amount of
vectorization code that we are not using (and are not likely to use at all for nvptx).
Right now, we're just trying to pass the new flag to a kernels tree-parloops based pass.

In any case, when setting your flag you should also set safelen = INT_MAX,
as the OpenACC independent implies that you can vectorize the loop with any
vectorization factor without performing dependency analysis on the loop.
OpenACC is (hopefully) not just about PTX and most other targets will want
to vectorize such loops.
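
The effect of safelen = INT_MAX can be sketched with a toy model. Everything here is an assumption for illustration: `struct toy_loop` is a simplified stand-in for GCC's struct loop, and `mark_independent` and `clamp_vf` are hypothetical helpers, not GCC functions. The idea is only that a consumer which caps its vectorization factor at safelen never gets capped once safelen is INT_MAX.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for GCC's struct loop. */
struct toy_loop {
    bool marked_independent;
    int safelen;              /* 0 means: no user guarantee */
};

/* Mirrors the idea of the patch: an independent loop also gets
   the largest possible safelen. */
static void mark_independent(struct toy_loop *loop) {
    loop->marked_independent = true;
    loop->safelen = INT_MAX;
}

/* Hypothetical consumer: cap the desired vectorization factor at
   safelen.  With no guarantee (safelen == 0) this toy falls back to 1;
   a real compiler would run dependence analysis instead. */
static int clamp_vf(const struct toy_loop *loop, int wanted_vf) {
    if (loop->safelen == 0)
        return 1;
    return wanted_vf < loop->safelen ? wanted_vf : loop->safelen;
}
```

With safelen set to 4, a request for a factor of 8 is cut down to 4; after `mark_independent`, any requested factor passes through unchanged.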


This patch sets safelen to INT_MAX for loops marked with the independent clause on the OpenACC loop directive.

Built and reg-tested on x86_64 with an NVIDIA accelerator.

Committed to gomp-4_0-branch.

Thanks,
- Tom

Set safelen to INT_MAX for oacc independent pragma

2015-07-22  Tom de Vries  <tom@codesourcery.com>

	* omp-low.c (expand_omp_for): Set loop->safelen to INT_MAX if
	marked_independent.
---
 gcc/omp-low.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 0419dcd..65c6321 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -8286,6 +8286,7 @@ expand_omp_for (struct omp_region *region, gimple inner_stmt)
 	{
 	  struct loop *loop = region->cont->loop_father; 
 	  loop->marked_independent = true;
+	  loop->safelen = INT_MAX;
 	}
     }
   else if (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_SIMD)
-- 
1.9.1

