This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][parloops] Modify parloops to allow code generation on SESE regions
- From: Zdenek Dvorak <rakdver at kam dot mff dot cuni dot cz>
- To: Antoniu <antoniu dot pop at gmail dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Wed, 23 Apr 2008 17:35:00 +0200
- Subject: Re: [PATCH][parloops] Modify parloops to allow code generation on SESE regions
- References: <1b8617c60804230807m42b0c74cu17fd32958b637a80@mail.gmail.com>
Hi,
> Index: gcc/dominance.c
> ===================================================================
> --- gcc/dominance.c (revision 134558)
> +++ gcc/dominance.c (working copy)
> @@ -693,6 +693,15 @@ free_dominance_info (enum cdi_direction
> dom_computed[dir_index] = DOM_NONE;
> }
>
> +/* Force the re-evaluation of the dominance information for the
> + direction DIR. */
> +void
> +recalculate_dominance_info (enum cdi_direction dir)
> +{
> + free_dominance_info (dir);
> + calculate_dominance_info (dir);
> +}
> +
Why is this needed? You seem to use this only for post-dominators, and
postdominators do not seem to be used anywhere in the code (except in
asserts, which IMHO is not good enough reason to compute them)?
> Index: gcc/tree-ssa-loop-ivopts.c
> ===================================================================
> --- gcc/tree-ssa-loop-ivopts.c (revision 134558)
> +++ gcc/tree-ssa-loop-ivopts.c (working copy)
> @@ -1258,6 +1258,41 @@ find_interesting_uses_cond (struct ivopt
> record_use (data, cond_p, civ, stmt, USE_COMPARE);
> }
>
> +/* Returns true if expression EXPR is not defined between ENTRY and
> + EXIT, i.e. if all its operands are defined outside of the region. */
> +
> +bool
> +expr_invariant_in_region_p (edge entry, edge exit, tree expr)
Move this function to tree-parloops.c and make it static (it does not seem
to make sense to have it placed in tree-ssa-loop-ivopts.c).
> + gather_blocks_in_sese_region_exclusive (entry_bb, exit_bb, &body);
> +
> + for (i = 0; VEC_iterate (basic_block, body, i, bb); i++)
> + for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
> + eliminate_local_variables_stmt (entry, bsi_stmt (bsi),
> + decl_address);
add VEC_free (body);
> @@ -1156,7 +1176,7 @@ separate_decls_in_loop (struct loop *loo
> *arg_struct = NULL;
> *new_arg_struct = NULL;
> }
> - else
> + else if (reduction_list)
> {
> /* Create the type for the structure to store the ssa names to. */
> type = lang_hooks.types.make_type (RECORD_TYPE);
> @@ -1195,11 +1215,37 @@ separate_decls_in_loop (struct loop *loo
> htab_traverse (reduction_list, create_stores_for_reduction,
> ld_st_data);
> clsn_data.load = make_ssa_name (nvar, NULL_TREE);
> - clsn_data.load_bb = single_dom_exit (loop)->dest;
> + clsn_data.load_bb = exit->dest;
> clsn_data.store = ld_st_data->store;
> create_final_loads_for_reduction (reduction_list, &clsn_data);
> }
> }
> + else
> + {
> + /* Create the type for the structure to store the ssa names to. */
> + type = lang_hooks.types.make_type (RECORD_TYPE);
> + type_name = build_decl (TYPE_DECL, create_tmp_var_name (".paral_data"),
> + type);
> + TYPE_NAME (type) = type_name;
> +
> + htab_traverse (name_copies, add_field_for_name, type);
> + layout_type (type);
> +
> + /* Create the loads and stores. */
> + *arg_struct = create_tmp_var (type, ".paral_data_store");
> + add_referenced_var (*arg_struct);
> + nvar = create_tmp_var (build_pointer_type (type), ".paral_data_load");
> + add_referenced_var (nvar);
> + *new_arg_struct = make_ssa_name (nvar, NULL_TREE);
> +
> + ld_st_data->store = *arg_struct;
> + ld_st_data->load = *new_arg_struct;
> + ld_st_data->store_bb = bb0;
> + ld_st_data->load_bb = bb1;
> +
> + htab_traverse (name_copies, create_loads_and_stores_for_name,
> + ld_st_data);
> + }
this duplicates a lot of code; add the if (reduction_list) to the
one or two places in the else block where it is needed instead.
> @@ -1698,6 +1745,7 @@ gen_parallel_loop (struct loop *loop, ht
>
> /* Base all the induction variables in LOOP on a single control one. */
> canonicalize_loop_ivs (loop, reduction_list, nit);
> + update_ssa (TODO_update_ssa);
Why do you need to update_ssa here?
> Index: gcc/tree-cfg.c
> ===================================================================
> --- gcc/tree-cfg.c (revision 134558)
> +++ gcc/tree-cfg.c (working copy)
> @@ -5527,6 +5527,59 @@ gather_blocks_in_sese_region (basic_bloc
> }
> }
>
> +/* Add all the blocks dominated by ENTRY to the array BBS_P. Stop
> + adding blocks when the dominator traversal reaches EXIT. This
> + function silently assumes that ENTRY strictly dominates EXIT. This
> + function differs from gather_blocks_in_sese_region in that it will
> + not include either ENTRY or EXIT in the results. */
> +
> +void
> +gather_blocks_in_sese_region_exclusive (basic_block entry, basic_block exit,
> + VEC(basic_block,heap) **bbs_p)
It would probably be better to avoid introducing this basically duplicate functionality,
use gather_blocks_in_sese_region instead, and just skip the entry and exit blocks
where necessary.
> +/* Return nonzero if basic block BB belongs to the SESE region between
> + ENTRY and EXIT. */
> +bool
> +bb_inside_sese_region_p (basic_block entry, basic_block exit,
> + const_basic_block bb)
> +{
this seems to be equivalent to "entry dominates bb and exit does not"?
I.e., could be implemented using dominated_by_p, instead of cfg
traversal?
Zdenek