This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [patch] PR18181 - vectorizer fix





I'm resending this patch, updated to use the new phi_reverse function
(instead of introducing the new function slpeel_create_phi_node), and to
account for a part of the patch that already went in
(http://gcc.gnu.org/ml/gcc-patches/2004-11/msg00893.html). I also tried to
break the patch down into smaller pieces: attached here is the entire
patch, and coming up shortly is the same patch split into 4 parts: 3 cleanup
patches that prepare the ground, and a 4th patch that actually changes the
old peeling scheme and fixes PR18181.

Bootstrapped and tested on powerpc-apple-darwin (including SPEC) and
i686-pc-linux-gnu.

ok for mainline?

thanks,
dorit

Changelog:

        * tree-vectorizer.c (slpeel_tree_peel_loop_to_edge): New name for
        function previously called tree_duplicate_loop_to_edge.
        (slpeel_tree_duplicate_loop_to_edge_cfg): New name for function
        previously called tree_duplicate_loop_to_edge_cfg.
        (slpeel_update_phis_for_duplicate_loop): Prefix 'slpeel' added to
        function name.
        (slpeel_update_phi_nodes_for_guard): Likewise.
        (slpeel_make_loop_iterate_ntimes): Likewise.
        (slpeel_add_loop_guard): Likewise.
        (allocate_new_names, free_new_names): Function declaration moved to
        top of file.
        (rename_use_op, rename_def_op): Likewise.
        (rename_variables_in_bb, rename_variables_in_loop): Likewise.
        (vect_generate_tmps_on_preheader): Function declaration moved.
        (vect_transform_for_unknown_loop_bound): Added missing function
        declaration.

        (slpeel_can_duplicate_loop_p): New name for function
        previously called verify_loop_for_duplication. All conditions
        compacted into one compound condition. Removed debug dumps.
        (vect_analyze_loop_with_symbolic_num_of_iters): Removed. Some of
        the functionality moved to vect_can_advance_ivs_p, and some to
        vect_analyze_loop_form.
        (vect_can_advance_ivs_p): New function. Contains functionality that
        was taken out of vect_analyze_loop_with_symbolic_num_of_iters.
        (slpeel_tree_peel_loop_to_edge): Call slpeel_can_duplicate_loop_p.
        (vect_analyze_operations): Call vect_can_advance_ivs_p and
        slpeel_can_duplicate_loop_p.
        (vect_get_loop_niters): Added documentation.
        (vect_analyze_loop_form): Check the loop entry always - not only in
        case of unknown loop bound. Create preheader and exit bb if
        necessary. Apply a check that used to take place in
        vect_analyze_loop_with_symbolic_num_of_iters.
        (vectorize_loops): Call verify_loop_closed_ssa under
        ENABLE_CHECKING. Remove redundant call to
        rewrite_into_loop_closed_ssa.
        (vect_compute_data_refs_alignment): Removed obsolete comment.

        (slpeel_make_loop_iterate_ntimes): Last two arguments removed.
        (slpeel_tree_peel_loop_to_edge): Call
        slpeel_make_loop_iterate_ntimes without last two arguments. Update
        single_exit of loops.
        (vect_update_niters_after_peeling): Removed. Its functionality was
        moved to vect_do_peeling_for_alignment.
        (vect_do_peeling_for_loop_bound): New name for function previously
        called vect_transform_for_unknown_loop_bound.
        (vect_transform_loop_bound): Call slpeel_make_loop_iterate_ntimes
        instead of code that duplicates the same functionality.
        (vect_do_peeling_for_alignment): Functionality of
        vect_update_niters_after_peeling moved here.
        (vect_transform_loop): Unify call to vect_do_peeling_for_loop_bound
        - previously named vect_transform_for_unknown_loop_bound - for both
        known and unknown loop bound cases.

        (slpeel_tree_peel_loop_to_edge): Peeling scheme changed to support
        uses-after-loop and to avoid creating flow paths that shouldn't
        exist.
        (slpeel_update_phi_nodes_for_guard): Takes additional two
        arguments. Modified to fit the new peeling scheme. Avoid quadratic
        behavior.
        (slpeel_add_loop_guard): Takes additional argument.
        (slpeel_verify_cfg_after_peeling): New function.
        (vect_update_ivs_after_vectorizer): Takes additional argument.
        Updated documentation. Use 'exit-bb' instead of creating 'new-bb'.
        (rename_variables_in_bb): Don't update phis for BBs out of loop, to
        fit the new peeling scheme.
        (copy_phi_nodes): Function removed. Its functionality moved to
        update_phis_for_duplicate_loop.
        (slpeel_update_phis_for_duplicate_loop): Functionality of
        copy_phi_nodes moved here. Added documentation. Modified to fit the
        new peeling scheme.
        (slpeel_make_loop_iterate_ntimes): Setting loop->single_exit not
        needed - done in slpeel_tree_peel_loop_to_edge.
        (slpeel_tree_duplicate_loop_to_edge_cfg): Debug printouts
        compacted.
        (vect_do_peeling_for_loop_bound): Add documentation. Call
        slpeel_verify_cfg_after_peeling. Call
        vect_update_ivs_after_vectorizer with additional argument.
        (vect_do_peeling_for_alignment): Call
        slpeel_verify_cfg_after_peeling.

        (vect_finish_stmt_generation): Avoid 80 column overflow.

Patch:

(See attached file: pr18181.parts1234)





                                                                                                                                 
Dorit Naishlos/Haifa/IBM@IBMIL
Sent by: gcc-patches-owner@gcc.gnu.org
04/11/2004 16:42

To:       gcc-patches@gcc.gnu.org
cc:
Subject:  [patch] PR18181 - vectorizer fix
                                                                                                                                 








Two problems in the current peeling scheme are fixed:
1) When we update ssa-names after peeling, we now consider the loop exit
phis as well (relying on loop-closed ssa form), not only the loop entry
phis.
2) We don't create flow paths that should not exist: given a loop known to
iterate at least once, the previous implementation created two loops with a
redundant path that bypasses both of them; the current fix eliminates this
path by wiring the two loops appropriately. See diagrams below. Such a path
can cause problems, e.g. when there's an invariant defined in the loop: if
we skip the first loop, we reach a path where there's no def for the
invariant (see the example in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18181 and the attached
testcases).
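
To make the second problem concrete, here is a minimal sketch of the kind of
source loop that is affected. It is a hypothetical illustration, not the
testcase from the PR: the function name copy_and_scale, its parameters and
the variable 'last' are all made up for this example. The point is only that
a value defined inside a loop that runs at least once is used after the loop,
so every path reaching that use must pass through at least one copy of the
loop body.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example (not the PR testcase): 'last' is assigned inside
   the loop and read after it, so the code after the loop needs a
   reaching definition on every path that gets there.  */
int
copy_and_scale (int *dst, const int *src, size_t n, int scale)
{
  int last;
  size_t i = 0;

  assert (n >= 1);   /* The loop is known to iterate at least once.  */

  do
    {
      last = scale * 2;        /* Invariant, but defined inside the loop.  */
      dst[i] = src[i] + last;
      i++;
    }
  while (i < n);

  return last;                 /* Use after the loop.  */
}
```

If such a loop is peeled into two loops and the control flow allows both
copies to be skipped, the use of 'last' after the loop is reached without any
definition, which is the situation described in 2) above.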

So we change the current peeling scheme:

        bb_before_first_loop:
          if (guard1) GOTO bb_between_loops      /* skip first loop */
          else        GOTO first_loop
        first_loop:
          do {
          } while ...
        bb_between_loops:
          if (guard2) GOTO bb_after_second_loop /* skip second loop */
          else        GOTO second_loop
        second_loop:
          do {
          } while ...
        bb_after_second_loop:
        orig_exit_bb:


to the following peeling scheme (changes marked with [ ]):

        bb_before_first_loop:
          if (guard1) GOTO [bb_before_second_loop] /* skip first loop */
          else        GOTO first_loop
        first_loop:
          do {
          } while ...
        bb_between_loops:
          if (guard2) GOTO bb_after_second_loop  /* skip second loop */
          else        GOTO bb_before_second_loop
        [bb_before_second_loop:]        /* New labeled basic block.  */
        second_loop:
          do {
          } while ...
        bb_after_second_loop:
        orig_exit_bb:
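
The difference between the two diagrams can be checked with a small
stand-alone C model. This is only a sketch of the control flow, not code from
the patch: the loop bodies are replaced by setting a flag, the guards are
plain parameters, and the names old_scheme, new_scheme, check_schemes,
RAN_FIRST and RAN_SECOND are invented for the illustration.

```c
#include <assert.h>

enum { RAN_FIRST = 1, RAN_SECOND = 2 };

/* Old scheme: skipping the first loop lands on the second guard, so
   both loops can be bypassed.  */
int
old_scheme (int guard1, int guard2)
{
  int ran = 0;

  if (guard1)
    goto bb_between_loops;      /* skip first loop */
  ran |= RAN_FIRST;             /* stands in for first_loop */
bb_between_loops:
  if (guard2)
    goto bb_after_second_loop;  /* skip second loop */
  ran |= RAN_SECOND;            /* stands in for second_loop */
bb_after_second_loop:
  return ran;
}

/* New scheme: skipping the first loop jumps past the second guard,
   straight to bb_before_second_loop.  */
int
new_scheme (int guard1, int guard2)
{
  int ran = 0;

  if (guard1)
    goto bb_before_second_loop; /* skip first loop */
  ran |= RAN_FIRST;             /* stands in for first_loop */
  /* bb_between_loops */
  if (guard2)
    goto bb_after_second_loop;  /* skip second loop */
bb_before_second_loop:
  ran |= RAN_SECOND;            /* stands in for second_loop */
bb_after_second_loop:
  return ran;
}

/* Enumerate the guard combinations that matter.  */
void
check_schemes (void)
{
  assert (old_scheme (1, 1) == 0);           /* neither loop runs */
  assert (new_scheme (1, 1) == RAN_SECOND);  /* second loop always runs */
  assert (new_scheme (0, 1) == RAN_FIRST);
  assert (new_scheme (0, 0) == (RAN_FIRST | RAN_SECOND));
}
```

In this model the old wiring has a path on which no loop body executes, while
in the new wiring every path executes at least one of the two loops.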


The patch also removes two cases of quadratic behavior that are present in
the current scheme when updating phi nodes. I also took the opportunity to
make a couple of cleanups to the functions related to peeling:
1) Make a clearer separation between the peeling functions and the
vectorization functions (moved function declarations around, added the
prefix "slpeel", which stands for "simple loop peeling", to some functions).
2) Some functions were removed, and parts of their functionality moved to
other functions.

bootstrapped and tested on ppc-darwin.

ok for mainline?

thanks,
dorit

Changelog:

        * tree-vectorizer.c (tree_peel_loop_to_edge): New name for function
        previously called tree_duplicate_loop_to_edge.
        (vect_do_peeling_for_loop_bound): New name for function previously
        called vect_transform_for_unknown_loop_bound.
        (slpeel_can_duplicate_loop_p): New name for function previously
        called verify_loop_for_duplication.
        (slpeel_verify_cfg_after_peeling): New function.
        (slpeel_create_phi_node): New function.
        (slpeel_can_duplicate_loop_p): Prefix 'slpeel' added to function
        name.
        (slpeel_update_phis_for_duplicate_loop): Likewise.
        (slpeel_make_loop_iterate_ntimes): Likewise.
        (slpeel_add_loop_guard): Likewise.

        (tree_peel_loop_to_edge): Function declaration moved to top of
        file.
        (tree_duplicate_loop_to_edge_cfg): Likewise.
        (allocate_new_names, free_new_names): Likewise.
        (rename_use_op, rename_def_op): Likewise.
        (rename_variables_in_bb, rename_variables_in_loop): Likewise.

        (tree_duplicate_loop_to_edge): Renamed to tree_peel_loop_to_edge.
        Peeling scheme modified.
        (update_phi_nodes_for_guard): Modified to fit the new peeling
        scheme.  Avoid quadratic behavior.
        (add_loop_guard): Renamed to slpeel_add_loop_guard.  Takes
        additional argument dom_bb, to fit the new peeling scheme.
        (rename_variables_in_bb): Don't update phis for BBs out of loop, to
        fit the new peeling scheme.
        (vect_update_ivs_after_vectorizer): Modified to fit the new peeling
        scheme.  Takes additional argument update_e.  Avoid quadratic
        behavior.

        (vect_transform_for_unknown_loop_bound): Renamed to
        vect_peel_for_loop_bound.  Call slpeel_verify_cfg_after_peeling.
        Call vect_update_ivs_after_vectorizer with additional argument.
        (vect_do_peeling_for_alignment): Calls
        slpeel_verify_cfg_after_peeling.
        (copy_phi_nodes): Function removed. Its functionality moved to
        update_phis_for_duplicate_loop.
        (update_phis_for_duplicate_loop): Added functionality that used to
        be in copy_phi_nodes.  Added documentation.
        (verify_loop_for_duplication): Renamed to
        slpeel_can_duplicate_loop_p.  Added check for loop->num_nodes.
        (vect_transform_loop): Unify known and unknown loop-bound cases.
        (vect_analyze_operations): Call vect_can_advance_ivs_p. Call
        slpeel_can_duplicate_loop_p.
        (vect_analyze_loop_with_symbolic_num_of_iters): Removed. Some of
        the functionality moved to vect_can_advance_ivs_p, and some to
        vect_analyze_loop_form.
        (vect_can_advance_ivs_p): New function. Contains functionality that
        was taken out of vect_analyze_loop_with_symbolic_num_of_iters.
        (vect_analyze_loop_form): Create a preheader/exit bb if needed.
        Added functionality from
        vect_analyze_loop_with_symbolic_num_of_iters.
        (vectorize_loops): Added call to verify_loop_closed_ssa.

patch:

(See attached file: patch.nov4)

testcases:

(See attached file: vecttest.nov4.tar.gz)



#### patch.nov4 has been removed from this note on November 07, 2004 by Dorit Naishlos
#### vecttest.nov4.tar.gz has been removed from this note on November 07, 2004 by Dorit Naishlos

Attachment: pr18181.parts1234
Description: Binary data

