[autovect][patch]Eliminate the need for variable renaming in the vectorizer
Dorit Naishlos
DORIT@il.ibm.com
Mon Dec 6 19:42:00 GMT 2004
Thanks.
If ok with you, I'll commit it tomorrow with the following changes to
update_vuses_to_preheader:
- combined two nested if stmts into one (I think that's what the coding
conventions say).
- used IS_EMPTY_STMT instead of the check for NOP_EXPR.
- removed the 'gcc_assert (j >= 0)' check cause it's done inside
PHI_ARG_DEF.
- added an assert that if an update of a vuse is needed, it is successfully
carried out.
*** tree-vectorizer.c.keith2 Mon Dec 6 11:55:46 2004
--- tree-vectorizer.c Mon Dec 6 16:52:48 2004
*************** update_vuses_to_preheader (tree stmt, st
*** 1109,1137 ****
tree ssa_name = VUSE_OP (vuses, i);
tree def_stmt = SSA_NAME_DEF_STMT (ssa_name);
tree name_var = SSA_NAME_VAR (ssa_name);
/* For a use before any definitions, def_stmt is a NOP_EXPR. */
! if (TREE_CODE (def_stmt) != NOP_EXPR)
{
/* If the block containing the statement defining the SSA_NAME
is in the loop then it's necessary to find the definition
outside the loop using the PHI nodes of the header. */
! basic_block bb = bb_for_stmt (def_stmt);
! if (flow_bb_inside_loop_p (loop, bb))
! {
! tree phi;
! for (phi = phi_nodes (header_bb); phi; phi = TREE_CHAIN (phi))
! {
! if (SSA_NAME_VAR (PHI_RESULT (phi)) == name_var)
! {
! int j = phi_arg_from_edge (phi, preheader_e);
! gcc_assert (j >= 0);
! SET_VUSE_OP (vuses, i, PHI_ARG_DEF (phi, j));
! break;
! }
! }
}
! }
}
}
--- 1109,1138 ----
tree ssa_name = VUSE_OP (vuses, i);
tree def_stmt = SSA_NAME_DEF_STMT (ssa_name);
tree name_var = SSA_NAME_VAR (ssa_name);
+ basic_block bb = bb_for_stmt (def_stmt);
/* For a use before any definitions, def_stmt is a NOP_EXPR. */
! if (!IS_EMPTY_STMT (def_stmt)
! && flow_bb_inside_loop_p (loop, bb))
{
/* If the block containing the statement defining the SSA_NAME
is in the loop then it's necessary to find the definition
outside the loop using the PHI nodes of the header. */
! tree phi;
! bool updated = false;
!
! for (phi = phi_nodes (header_bb); phi; phi = TREE_CHAIN (phi))
! {
! if (SSA_NAME_VAR (PHI_RESULT (phi)) == name_var)
! {
! int j = phi_arg_from_edge (phi, preheader_e);
! SET_VUSE_OP (vuses, i, PHI_ARG_DEF (phi, j));
! updated = true;
! break;
! }
}
! gcc_assert (updated);
! }
}
}
Bootstrapped and Tested (including SPEC) on powerpc-darwin.
dorit
Keith Besaw
<kbesaw@us.ibm.co To: Dorit Naishlos/Haifa/IBM@IBMIL
m> cc: gcc-patches@gcc.gnu.org
Subject: Re: [autovect][patch]Eliminate the need for variable renaming in the vectorizer
03/12/2004 07:15
Revised patch to incorporate Dorit's suggestions.
Dorit Naishlos <DORIT@il.ibm.com> wrote on 12/01/2004 07:54:21 AM:
>
>
>
>
> Thanks, Keith.
>
> A few comments:
>
> --> vect_transform_loop: use 'bool' for the new argument
> ('need_loop_closed_rewrite') instead of 'int'.
>
> --> the places that set '*need_loop_closed_rewrite = 1;'
> can be marked with '/* FORNOW. */'.
> Currently, the peeling scheme in the vectorizer does not maintain
> loop-closed-ssa-form. When I merge the revised peeling scheme from
mainline
> to autovect branch, I will also make it preserve loop-closed form, and
> together with your patch we could avoid calling the renamer entirely in
the
> vectorizer.
need_loop_closed_rewrite is changed to bool.
>
> --> 'need_loop_closed_rewrite' should also be set to '1' for the case
when
> the loop bound is known but doesn't divide by the vectorization factor
(not
> only the unknown-loop-bound case).
Set need_loop_closed_rewrite to true for this case also.
>
> --> in:
> > + int num_args = PHI_NUM_ARGS (phi);
> > + for (j=0; j<num_args; j++)
> > + {
> > + edge phi_edge = PHI_ARG_EDGE (phi, j);
> > + if (!flow_bb_inside_loop_p (loop, phi_edge->src))
> > + {
> > + SET_VUSE_OP (vuses, i, PHI_ARG_DEF (phi, j));
> > + break;
> > + }
> > + }
> Can we use the following instead of the above loop:
> e = loop_preheader_edge (loop);
> phi_arg_from_edge (phi, e);
> ?
Replaced for j loop as requested.
>
> --> question: why does the following update concern may-defs only, and
not
> also must-defs?
> > + /* Copy the V_MAY_DEFS representing the aliasing of the original
> array
> > + element's definition to the vector's definition then update the
> > + defining statement. The original is being deleted so the same
> > + SSA_NAMEs can be used. */
> > + copy_virtual_operands (*vec_stmt, stmt);
> > + v_may_defs = STMT_V_MAY_DEF_OPS (*vec_stmt);
> > + nv_may_defs = NUM_V_MAY_DEFS (v_may_defs);
> > +
> > + for (i = 0; i < nv_may_defs; i++)
> > + {
> > + tree ssa_name = V_MAY_DEF_RESULT (v_may_defs, i);
> > + SSA_NAME_DEF_STMT (ssa_name) = *vec_stmt;
> > + }
>
As far as I can tell there will be no must-defs. An array element
reference
or pointer reference to an array element will only generate a may-def
because
only a portion of the object is being accessed.
>
> --> about the description of the new function
'update_vuses_to_preheader':
>
> > + It's possible to vectorize a loop even though an SSA_NAME from a
> VUSE
> > + appears to be defined in a V_MAY_DEF in another statement in a
loop.
> > + Once
> ^^^^ (One?)
Fixed
>
> > + such case is when the VUSE is at the dereference of a
__restricted__
> > + pointer in a load and the V_MAY_DEF
> ^ (is)
>
Fixed
> > + at the dereference of a different
> > + __restricted__ pointer in a store. Vectorization may result in
> > + copy_virtual_uses being called to copy the problem
> ^^^^^^^ (problematic?)
Changed
>
> > + VUSE to a new statement
> > + statement
> ^^^^^^^^^ (duplication)
Fixed
>
> > + that is being inserted in the loop preheader. This procedure
> > + is called to change the SSA_NAME in the new statement's VUSE from
> the
> > + SSA_NAME updated in the loop to the related SSA_NAME available on
> the
> > + path entering the loop. */
> > +
>
> We can also add the following to the function description:
> "
> When this function is called, we have the following situation:
>
> # vuse <name1>
> S1: vload
> do {
> # name1 = phi < name0 , name2>
>
> # vuse <name1>
> S2: vload
>
> # name2 = vdef <name1>
> S3: vstore
>
> }while...
>
> Stmt S1 was created in the loop preheader block as part of
misaligned-load
> handling. This function fixes the name of the vuse of S1 from 'name1' to
> 'name0'.
> "
The above additional commentary has been added to the function
description.
>
> Let me know if its ok to commit your patch with the changes above.
>
> thanks!
>
> dorit
>
>
>
>
>
> Keith Besaw
> <kbesaw@us.ibm.com To: gcc-
> patches@gcc.gnu.org
> > cc: Dorit
> Naishlos/Haifa/IBM@IBMIL
> Sent by: Subject: [autovect]
> [patch]Eliminate the need for variable renaming in the
> gcc-patches-owner@ vectorizer
> gcc.gnu.org
>
>
> 30/11/2004 10:13
>
>
>
>
>
> The vectorizer currently requires a call to rewrite_into_ssa to rename
the
> virtual
> references associated with array element references that are replaced by
> vector references. This patch eliminates the need for the renaming.
> Vector
> loads in most cases use the same VUSE SSA_NAME as in the array element
> load they replace. Vector stores use the same V_MAY_DEF as the array
> element store they replace with an update to the defining statement.
>
> Tested on ppc
> No differences in make check or in SPEC.
>
> OK for autovect branch?
>
> Keith.
>
>
> Changelog:
> * tree-vectorizer.c (update_vuses_to_preheader): new, fixup
VUSEs
> for
> loads added to loop preheader block.
> (vect_create_data_ref_ptr): remove code which sets the bitmap
> vars_to_rename
> (vectorizable_store): call copy_virtual_operands to copy
> V_MAY_DEFs
> from array element store to vector store and then update the
> statements
> defining the virtual SSA_NAMES.
> (vectorizable_load): call copy_virtual_operands to copy VUSEs
from
> array element load to vector load. Call
update_vuses_to_preheader
> for
> loads inserted into loop preheader block.
> (vect_do_peeling_for_alignment): add need_loop_closed_rewrite
> parm.
> (vect_transform_loop): add need_loop_closed_rewrite_parm.
> (vectorize_loops): eliminate call to rewrite_into_ssa, only call
> rewrite_into_loop_closed_ssa if tree_duplicate_loop_to_edge has
> been
> called in vect_do_peeling_for_alignment or vect_transform_loop.
> *rs6000.c(altivect_init_builtins): add the "const" attribute to
> __builtin_altivec_mask_for_store and
> __builtin_altivec_mask_for_load.
>
>
>
> 123456789012356789012345678901234567890123456789012345678901234567890
>
>
> #### vect.11-29.diff has been removed from this note on December 01,
2004
> by Dorit Naishlos
>
#### vect.12-3.diff has been removed from this note on December 06, 2004 by
Dorit Naishlos
More information about the Gcc-patches
mailing list