This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def
- From: Richard Biener <rguenther at suse dot de>
- To: Tom de Vries <Tom_deVries at mentor dot com>
- Cc: "gcc-patches at gnu dot org" <gcc-patches at gnu dot org>, Jakub Jelinek <jakub at redhat dot com>
- Date: Wed, 11 Nov 2015 12:02:35 +0100 (CET)
- Subject: Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def
- References: <5640BD31 dot 2060602 at mentor dot com> <5640FB07 dot 6010008 at mentor dot com>
On Mon, 9 Nov 2015, Tom de Vries wrote:
> On 09/11/15 16:35, Tom de Vries wrote:
> > Hi,
> >
> > this patch series for stage1 trunk adds support to:
> > - parallelize oacc kernels regions using parloops, and
> > - map the loops onto the oacc gang dimension.
> >
> > The patch series contains these patches:
> >
> > 1 Insert new exit block only when needed in
> > transform_to_exit_first_loop_alt
> > 2 Make create_parallel_loop return void
> > 3 Ignore reduction clause on kernels directive
> > 4 Implement -foffload-alias
> > 5 Add in_oacc_kernels_region in struct loop
> > 6 Add pass_oacc_kernels
> > 7 Add pass_dominator_oacc_kernels
> > 8 Add pass_ch_oacc_kernels
> > 9 Add pass_parallelize_loops_oacc_kernels
> > 10 Add pass_oacc_kernels pass group in passes.def
> > 11 Update testcases after adding kernels pass group
> > 12 Handle acc loop directive
> > 13 Add c-c++-common/goacc/kernels-*.c
> > 14 Add gfortran.dg/goacc/kernels-*.f95
> > 15 Add libgomp.oacc-c-c++-common/kernels-*.c
> > 16 Add libgomp.oacc-fortran/kernels-*.f95
> >
> > The first 9 patches are more or less independent, but patches 10-16 are
> > intended to be committed at the same time.
> >
> > Bootstrapped and reg-tested on x86_64.
> >
> > Built and reg-tested with nvidia accelerator, in combination with a
> > patch that enables accelerator testing (which is submitted at
> > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).
> >
> > I'll post the individual patches in reply to this message.
> >
>
> This patch adds the pass_oacc_kernels pass group to the pass list in
> passes.def.
>
> Note the repetition of pass_lim/pass_copy_prop. The first pair is for an inner
> loop in a loop nest, the second for an outer loop in a loop nest.
@@ -86,6 +86,27 @@ along with GCC; see the file COPYING3. If not see
/* pass_build_ealias is a dummy pass that ensures that we
execute TODO_rebuild_alias at this point. */
NEXT_PASS (pass_build_ealias);
+ /* Pass group that runs when there are oacc kernels in the
+ function. */
+ NEXT_PASS (pass_oacc_kernels);
+ PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
+ NEXT_PASS (pass_dominator_oacc_kernels);
+ NEXT_PASS (pass_ch_oacc_kernels);
+ NEXT_PASS (pass_dominator_oacc_kernels);
+ NEXT_PASS (pass_tree_loop_init);
+ NEXT_PASS (pass_lim);
+ NEXT_PASS (pass_copy_prop);
+ NEXT_PASS (pass_lim);
+ NEXT_PASS (pass_copy_prop);
Iterate lim/copyprop twice?! Why is that needed?
+ NEXT_PASS (pass_scev_cprop);
What's that for? It's supposed to help remove loops, and I don't
expect kernels to vanish.
+ NEXT_PASS (pass_tree_loop_done);
+ NEXT_PASS (pass_dominator_oacc_kernels);
Three times DOM? No please. I wonder why you don't run oacc_kernels
after FRE and drop the initial DOM(s).
+ NEXT_PASS (pass_dce);
+ NEXT_PASS (pass_tree_loop_init);
+ NEXT_PASS (pass_parallelize_loops_oacc_kernels);
+ NEXT_PASS (pass_expand_omp_ssa);
+ NEXT_PASS (pass_tree_loop_done);
The switches into/out of tree_loop also look odd to me, but well
(they'll be controlled by -ftree-loop-optimize).
+ POP_INSERT_PASSES ()
Please get some more sense into this pass pipeline.
Richard.
> Thanks,
> - Tom
>
>
--
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)