[PATCH, 3/8] Add pass_ch_oacc_kernels to pass_oacc_kernels
Richard Biener
rguenther@suse.de
Wed Apr 22 07:39:00 GMT 2015
On Tue, 21 Apr 2015, Thomas Schwinge wrote:
> Hi!
>
> On Tue, 25 Nov 2014 12:27:34 +0100, Tom de Vries <Tom_deVries@mentor.com> wrote:
> > On 15-11-14 18:21, Tom de Vries wrote:
> > > On 15-11-14 13:14, Tom de Vries wrote:
> > >> Hi,
> > >>
> > >> I'm submitting a patch series with initial support for the oacc kernels
> > >> directive.
> > >>
> > >> The patch series uses pass_parallelize_loops to implement parallelization of
> > >> loops in the oacc kernels region.
> > >>
> > >> The patch series consists of these 8 patches:
> > >> ...
> > >> 1 Expand oacc kernels after pass_build_ealias
> > >> 2 Add pass_oacc_kernels
> > >> 3 Add pass_ch_oacc_kernels to pass_oacc_kernels
> > >> 4 Add pass_tree_loop_{init,done} to pass_oacc_kernels
> > >> 5 Add pass_loop_im to pass_oacc_kernels
> > >> 6 Add pass_ccp to pass_oacc_kernels
> > >> 7 Add pass_parloops_oacc_kernels to pass_oacc_kernels
> > >> 8 Do simple omp lowering for no address taken var
> > >> ...
> > >
> > > This patch adds a pass_ch_oacc_kernels to the pass group pass_oacc_kernels.
> > >
> > > The idea is that pass_parallelize_loops only deals with loops for which the
> > > header has been copied, so the easiest way to meet that requirement when running
> > > pass_parallelize_loops in group pass_oacc_kernels, is to run pass_ch as a part
> > > of pass_oacc_kernels.
> > >
> > > We define a separate pass pass_ch_oacc_kernels, to leave all loops that aren't
> > > part of a kernels region alone.
> > >
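For readers unfamiliar with loop header copying, the requirement described above can be illustrated outside of GCC. The following is only a source-level sketch of the shape the transformation produces; the function names are made up for illustration, and the real pass works on GIMPLE basic blocks, not on C++ source.

```cpp
#include <cassert>

// Original shape: the exit test sits in the loop header and runs before
// every iteration, including the first one.
inline int
sum_while (const int *a, int n)
{
  int sum = 0;
  int i = 0;
  while (i < n)
    {
      sum += a[i];
      i++;
    }
  return sum;
}

// Shape after header copying: one copy of the test guards loop entry,
// and the loop itself becomes a do-while whose body is always executed
// once the loop is entered.
inline int
sum_header_copied (const int *a, int n)
{
  int sum = 0;
  int i = 0;
  if (i < n)
    {
      do
        {
          sum += a[i];
          i++;
        }
      while (i < n);
    }
  return sum;
}
```

Both functions compute the same result for any n; only the second has the do-while form that the quoted description says pass_parallelize_loops expects.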
> >
> > Updated for moving pass_oacc_kernels down past pass_fre in the pass list.
> >
> > Bootstrapped and reg-tested as before.
> >
> > OK for trunk?
>
> Committed to gomp-4_0-branch in r222281:
>
> commit 58c33a7965c379b55b549d50e3b79b2252bcc876
> Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
> Date: Tue Apr 21 19:48:16 2015 +0000
>
> Add pass_ch_oacc_kernels to pass_oacc_kernels
>
> gcc/
> * omp-low.c (loop_in_oacc_kernels_region_p): New function.
> * omp-low.h (loop_in_oacc_kernels_region_p): Declare.
> * passes.def: Add pass_ch_oacc_kernels to pass group pass_oacc_kernels.
> * tree-pass.h (make_pass_ch_oacc_kernels): Declare
> * tree-ssa-loop-ch.c: Include omp-low.h.
> (pass_ch_execute): Declare.
> (pass_ch::execute): Factor out ...
> (pass_ch_execute): ... this new function. If handling oacc kernels,
> skip loops that are not in oacc kernels region.
> (pass_ch_oacc_kernels::execute):
> (pass_data_ch_oacc_kernels): New pass_data.
> (class pass_ch_oacc_kernels): New pass.
> (pass_ch_oacc_kernels::execute, make_pass_ch_oacc_kernels): New
> function.
>
> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@222281 138bc75d-0d04-0410-961f-82ee72b054a4
> ---
> gcc/ChangeLog.gomp | 15 ++++++++
> gcc/omp-low.c | 91 ++++++++++++++++++++++++++++++++++++++++++++++++
> gcc/omp-low.h | 2 ++
> gcc/passes.def | 1 +
> gcc/tree-pass.h | 1 +
> gcc/tree-ssa-loop-ch.c | 59 +++++++++++++++++++++++++++++--
> 6 files changed, 167 insertions(+), 2 deletions(-)
>
> diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
> index 8a53ad8..d00c5e0 100644
> --- gcc/ChangeLog.gomp
> +++ gcc/ChangeLog.gomp
> @@ -1,5 +1,20 @@
> 2015-04-21 Tom de Vries <tom@codesourcery.com>
>
> + * omp-low.c (loop_in_oacc_kernels_region_p): New function.
> + * omp-low.h (loop_in_oacc_kernels_region_p): Declare.
> + * passes.def: Add pass_ch_oacc_kernels to pass group pass_oacc_kernels.
> + * tree-pass.h (make_pass_ch_oacc_kernels): Declare
> + * tree-ssa-loop-ch.c: Include omp-low.h.
> + (pass_ch_execute): Declare.
> + (pass_ch::execute): Factor out ...
> + (pass_ch_execute): ... this new function. If handling oacc kernels,
> + skip loops that are not in oacc kernels region.
> + (pass_ch_oacc_kernels::execute):
> + (pass_data_ch_oacc_kernels): New pass_data.
> + (class pass_ch_oacc_kernels): New pass.
> + (pass_ch_oacc_kernels::execute, make_pass_ch_oacc_kernels): New
> + function.
> +
> * passes.def: Add pass group pass_oacc_kernels.
> * tree-pass.h (make_pass_oacc_kernels): Declare.
> * tree-ssa-loop.c (gate_oacc_kernels): New static function.
> diff --git gcc/omp-low.c gcc/omp-low.c
> index 16d9a5e..1b03ae6 100644
> --- gcc/omp-low.c
> +++ gcc/omp-low.c
> @@ -13920,4 +13920,95 @@ gimple_stmt_omp_data_i_init_p (gimple stmt)
> SSA_OP_DEF);
> }
>
> +/* Return true if LOOP is inside a kernels region. */
> +
> +bool
> +loop_in_oacc_kernels_region_p (struct loop *loop, basic_block *region_entry,
> + basic_block *region_exit)
Ehm. So why not simply add a flag to struct loop instead and set it
during OMP region parsing/lowering?
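Roughly along these lines; a minimal sketch only, where toy_loop, toy_region, the flag name and both helper functions are hypothetical stand-ins and not GCC's actual data structures or API:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for struct loop, reduced to what the sketch needs.
struct toy_loop
{
  int header_index;            // index of the loop header block
  bool in_oacc_kernels_region; // hypothetical flag, set when the region is built
};

// Hypothetical stand-in for a kernels region: the block indices it contains.
struct toy_region
{
  std::vector<int> block_indices;
};

// Set the flag on every loop whose header lies inside REGION.  Intended to
// run once, at the point where the region is already known.
inline void
toy_mark_loops_in_region (std::vector<toy_loop> &loops,
                          const toy_region &region)
{
  for (toy_loop &loop : loops)
    for (int index : region.block_indices)
      if (index == loop.header_index)
        loop.in_oacc_kernels_region = true;
}

// Later passes only read the flag instead of recomputing the region from
// dominance information for every loop they visit.
inline bool
toy_loop_in_kernels_region_p (const toy_loop &loop)
{
  return loop.in_oacc_kernels_region;
}
```

The point of the sketch is that the query becomes a plain field read, whereas the function in the patch walks all basic blocks and dominated sets on each call.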
It's also very odd that you disable transforms on OMP regions but at
the same time do all the OMP processing _after_ those transforms.
Something feels backward here.
Richard.
> +{
> + bitmap excludes_bitmap = BITMAP_GGC_ALLOC ();
> + bitmap region_bitmap = BITMAP_GGC_ALLOC ();
> + bitmap_clear (region_bitmap);
> +
> + if (region_entry != NULL)
> + *region_entry = NULL;
> + if (region_exit != NULL)
> + *region_exit = NULL;
> +
> + basic_block bb;
> + gimple last;
> + FOR_EACH_BB_FN (bb, cfun)
> + {
> + if (bitmap_bit_p (region_bitmap, bb->index))
> + continue;
> +
> + last = last_stmt (bb);
> + if (!last)
> + continue;
> +
> + if (gimple_code (last) != GIMPLE_OMP_TARGET
> + || (gimple_omp_target_kind (last) != GF_OMP_TARGET_KIND_OACC_KERNELS))
> + continue;
> +
> + bitmap_clear (excludes_bitmap);
> + bitmap_set_bit (excludes_bitmap, bb->index);
> +
> + vec<basic_block> dominated
> + = get_all_dominated_blocks (CDI_DOMINATORS, bb);
> +
> + unsigned di;
> + basic_block dom;
> +
> + basic_block end_region = NULL;
> + FOR_EACH_VEC_ELT (dominated, di, dom)
> + {
> + if (dom == bb)
> + continue;
> +
> + last = last_stmt (dom);
> + if (!last)
> + continue;
> +
> + if (gimple_code (last) != GIMPLE_OMP_RETURN)
> + continue;
> +
> + if (end_region == NULL
> + || dominated_by_p (CDI_DOMINATORS, end_region, dom))
> + end_region = dom;
> + }
> +
> + if (end_region == NULL)
> + {
> + gimple kernels = last_stmt (bb);
> + fatal_error (gimple_location (kernels),
> + "End of kernel region unreachable");
> + }
> +
> + vec<basic_block> excludes
> + = get_all_dominated_blocks (CDI_DOMINATORS, end_region);
> +
> + unsigned di2;
> + basic_block exclude;
> +
> + FOR_EACH_VEC_ELT (excludes, di2, exclude)
> + if (exclude != end_region)
> + bitmap_set_bit (excludes_bitmap, exclude->index);
> +
> + FOR_EACH_VEC_ELT (dominated, di, dom)
> + if (!bitmap_bit_p (excludes_bitmap, dom->index))
> + bitmap_set_bit (region_bitmap, dom->index);
> +
> + if (bitmap_bit_p (region_bitmap, loop->header->index))
> + {
> + if (region_entry != NULL)
> + *region_entry = bb;
> + if (region_exit != NULL)
> + *region_exit = end_region;
> + return true;
> + }
> + }
> +
> + return false;
> +}
> +
> #include "gt-omp-low.h"
> diff --git gcc/omp-low.h gcc/omp-low.h
> index 3d30c3b..ae63c9f 100644
> --- gcc/omp-low.h
> +++ gcc/omp-low.h
> @@ -29,6 +29,8 @@ extern tree omp_reduction_init (tree, tree);
> extern bool make_gimple_omp_edges (basic_block, struct omp_region **, int *);
> extern void omp_finish_file (void);
> extern bool gimple_stmt_omp_data_i_init_p (gimple);
> +extern bool loop_in_oacc_kernels_region_p (struct loop *, basic_block *,
> + basic_block *);
>
> extern GTY(()) vec<tree, va_gc> *offload_funcs;
> extern GTY(()) vec<tree, va_gc> *offload_vars;
> diff --git gcc/passes.def gcc/passes.def
> index 854c5b8..5cdbc87 100644
> --- gcc/passes.def
> +++ gcc/passes.def
> @@ -90,6 +90,7 @@ along with GCC; see the file COPYING3. If not see
> function. */
> NEXT_PASS (pass_oacc_kernels);
> PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
> + NEXT_PASS (pass_ch_oacc_kernels);
> NEXT_PASS (pass_expand_omp_ssa);
> POP_INSERT_PASSES ()
> NEXT_PASS (pass_merge_phi);
> diff --git gcc/tree-pass.h gcc/tree-pass.h
> index 35778f2..321229a 100644
> --- gcc/tree-pass.h
> +++ gcc/tree-pass.h
> @@ -379,6 +379,7 @@ extern gimple_opt_pass *make_pass_loop_prefetch (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_iv_optimize (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_tree_loop_done (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_ch (gcc::context *ctxt);
> +extern gimple_opt_pass *make_pass_ch_oacc_kernels (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_ccp (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_phi_only_cprop (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_build_ssa (gcc::context *ctxt);
> diff --git gcc/tree-ssa-loop-ch.c gcc/tree-ssa-loop-ch.c
> index d759de7..5f24bcb 100644
> --- gcc/tree-ssa-loop-ch.c
> +++ gcc/tree-ssa-loop-ch.c
> @@ -54,12 +54,15 @@ along with GCC; see the file COPYING3. If not see
> #include "tree-inline.h"
> #include "flags.h"
> #include "tree-ssa-threadedge.h"
> +#include "omp-low.h"
>
> /* Duplicates headers of loops if they are small enough, so that the statements
> in the loop body are always executed when the loop is entered. This
> increases effectiveness of code motion optimizations, and reduces the need
> for loop preconditioning. */
>
> +static unsigned int pass_ch_execute (function *, bool);
> +
> /* Check whether we should duplicate HEADER of LOOP. At most *LIMIT
> instructions should be duplicated, limit is decreased by the actual
> amount. */
> @@ -178,6 +181,14 @@ public:
> unsigned int
> pass_ch::execute (function *fun)
> {
> + return pass_ch_execute (fun, false);
> +}
> +
> +} // anon namespace
> +
> +static unsigned int
> +pass_ch_execute (function *fun, bool oacc_kernels_p)
> +{
> struct loop *loop;
> basic_block header;
> edge exit, entry;
> @@ -211,6 +222,10 @@ pass_ch::execute (function *fun)
> if (do_while_loop_p (loop))
> continue;
>
> + if (oacc_kernels_p
> + && !loop_in_oacc_kernels_region_p (loop, NULL, NULL))
> + continue;
> +
> /* Iterate the header copying up to limit; this takes care of the cases
> like while (a && b) {...}, where we want to have both of the conditions
> copied. TODO -- handle while (a || b) - like cases, by not requiring
> @@ -301,10 +316,50 @@ pass_ch::execute (function *fun)
> return 0;
> }
>
> -} // anon namespace
> -
> gimple_opt_pass *
> make_pass_ch (gcc::context *ctxt)
> {
> return new pass_ch (ctxt);
> }
> +
> +namespace {
> +
> +const pass_data pass_data_ch_oacc_kernels =
> +{
> + GIMPLE_PASS, /* type */
> + "ch_oacc_kernels", /* name */
> + OPTGROUP_LOOP, /* optinfo_flags */
> + TV_TREE_CH, /* tv_id */
> + ( PROP_cfg | PROP_ssa ), /* properties_required */
> + 0, /* properties_provided */
> + 0, /* properties_destroyed */
> + 0, /* todo_flags_start */
> + TODO_cleanup_cfg, /* todo_flags_finish */
> +};
> +
> + class pass_ch_oacc_kernels : public gimple_opt_pass
> +{
> +public:
> + pass_ch_oacc_kernels (gcc::context *ctxt)
> + : gimple_opt_pass (pass_data_ch_oacc_kernels, ctxt)
> + {}
> +
> + /* opt_pass methods: */
> + virtual bool gate (function *) { return true; }
> + virtual unsigned int execute (function *);
> +
> +}; // class pass_ch_oacc_kernels
> +
> +unsigned int
> +pass_ch_oacc_kernels::execute (function *fun)
> +{
> + return pass_ch_execute (fun, true);
> +}
> +
> +} // anon namespace
> +
> +gimple_opt_pass *
> +make_pass_ch_oacc_kernels (gcc::context *ctxt)
> +{
> + return new pass_ch_oacc_kernels (ctxt);
> +}
>
>
> Grüße,
> Thomas
>
--
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)