This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [patch] Loop-aware SLP 3/5
- From: Ira Rosen <IRAR at il dot ibm dot com>
- To: Dorit Nuzman <DORIT at il dot ibm dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Sun, 9 Sep 2007 10:10:51 +0300
- Subject: Re: [patch] Loop-aware SLP 3/5
Dorit Nuzman/Haifa/IBM wrote on 18/08/2007 21:00:07:
> Ira Rosen/Haifa/IBM wrote on 14/08/2007 16:11:07:
>
> > This is the third part of loop-aware SLP patch.
> >
> > This part completes the SLP support inside loops by collecting SLP
> > seeds (groups of strided stores) and scheduling of SLP instances.
> > The SLP computation tree is scheduled in postorder. To vectorize a
> > node, we get its vector definitions from the already-SLPed left and
> > right nodes.
> >
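For readers unfamiliar with the scheduling order described above, here is a minimal standalone sketch of postorder scheduling over a binary computation tree. The `slp_node` structure and `schedule_postorder` function are simplified stand-ins of my own, not GCC's actual SLP data structures (`SLP_TREE_LEFT`/`SLP_TREE_RIGHT` and friends):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an SLP computation tree node; the real
   GCC structure carries scalar stmts, vectorized stmts, etc.  */
struct slp_node
{
  struct slp_node *left, *right;
  int vectorized;   /* set once this node has been "scheduled" */
};

/* Schedule the tree in postorder: a node is vectorized only after its
   left and right operand nodes, so that their vector definitions are
   already available when the node itself is processed.  */
static void
schedule_postorder (struct slp_node *node)
{
  if (!node)
    return;
  schedule_postorder (node->left);
  schedule_postorder (node->right);
  /* Both children (if any) have been vectorized at this point.  */
  assert (!node->left || node->left->vectorized);
  assert (!node->right || node->right->vectorized);
  node->vectorized = 1;
}
```

The invariant asserted in the middle is exactly what postorder buys: when a node is reached, its operands' vector definitions already exist.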
> + if (diff != 1)
> + {
> + /* SLP of accesses with gaps is not supported. */
> + slp_impossible = true;
>
> please add a FORNOW
>
> @@ -2925,6 +2936,10 @@ vectorizable_conversion (tree stmt, bloc
> else
> ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_in;
>
> + /* FORNOW */
> + if (slp_node)
> + ncopies = 1;
>
> can you please add a comment that explains why it is ok to
> overwrite ncopies with 1? (I guess because the computation of ncopies
> above is not valid for the slp case (why?), and that you verify
> beforehand that ncopies is indeed 1 for the case of SLP)
OK.
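For reference, the effect of the hunk can be modeled as below. The helper and the numbers are mine, purely for illustration; my reading is that the `VF / nunits` formula assumes one scalar stmt per vector stmt, which does not hold for SLP, where a whole group of scalar stmts is packed into one vector stmt:

```c
#include <assert.h>

/* Simplified model of the ncopies logic in vectorizable_conversion:
   the non-SLP rule makes one copy of the vector stmt per VF/nunits;
   FORNOW the SLP path hard-wires ncopies to 1, on the assumption that
   the analysis phase only admits SLP instances for which one copy
   suffices.  */
static int
compute_ncopies (int vect_factor, int nunits_in, int slp)
{
  int ncopies = vect_factor / nunits_in;
  if (slp)
    ncopies = 1;   /* FORNOW */
  return ncopies;
}
```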
>
> if (j == 0)
> - vec_oprnd0 = vect_get_vec_def_for_operand (op0, stmt, NULL);
> + if (slp_node)
> + {
> + /* Get vectorized arguments for SLP_NODE. */
> + if (!vect_get_slp_defs (slp_node, &vec_oprnds0, NULL))
> + return false;
>
> so is it now possible to fail during the transformation phase? what
> would it take to change things such that this behavior can be avoided?
I checked it, and it actually always returns true :), so I removed the
check (and the function does not return any value now).
>
> + }
> + else
> + {
> + vec_oprnd0 = vect_get_vec_def_for_operand (op0, stmt, NULL);
> + VEC_quick_push (tree, vec_oprnds0, vec_oprnd0);
> + }
> else
> - vec_oprnd0 = vect_get_vec_def_for_stmt_copy (dt0, vec_oprnd0);
> + {
> + /* FORNOW: unreachable with SLP. */
> + gcc_assert (!slp_node);
> +
> + vec_oprnd0 = vect_get_vec_def_for_stmt_copy (dt0, vec_oprnd0);
> + VEC_replace (tree, vec_oprnds0, 0, vec_oprnd0);
> + }
>
> I'm trying to see if we can reduce/hide the "if (slp_node)" switches
> in this code (which repeats in all the vectorizable* functions).
> Assuming we can't fail during slp-transform, we could replace the
> code above with calls to the following wrappers:
> if (j == 0)
> vect_get_vec_defs (op0, stmt, &vec_oprnds0, slp_node);
> else
> vect_get_vec_defs_for_stmt_copy (dt, &vec_oprnds0);
>
> ...which would be roughly something like the following:
>
> void
> vect_get_vec_defs_for_stmt_copy (dt, vec_oprnds)
> {
> tree vec_oprnd = VEC_pop (tree, *vec_oprnds);
> vec_oprnd = vect_get_vec_def_for_stmt_copy (dt, vec_oprnd);
> VEC_quick_push (tree, *vec_oprnds, vec_oprnd);
> }
>
> void
> vect_get_vec_defs (op, stmt, *vec_oprnds, slp_node)
> {
> if (slp_node)
> vect_get_slp_defs (slp_node, vec_oprnds, NULL);
> else
> {
> tree vec_oprnd;
> *vec_oprnds = VEC_alloc (tree, heap, 1);
> vec_oprnd = vect_get_vec_def_for_operand (op, stmt, NULL);
> VEC_quick_push (tree, *vec_oprnds, vec_oprnd);
> }
> return;
> }
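The refactoring proposed above — hiding the "if (slp_node)" switch behind a single wrapper that fills a vector of defs — can be demonstrated with a toy standalone version. The types and names here are simplified stand-ins of my own (plain ints instead of GCC's `tree`/`VEC` machinery), not the actual patch code:

```c
#include <assert.h>

#define MAX_DEFS 8

/* Toy replacement for VEC(tree,heap): a small fixed array of defs.  */
struct defs { int vals[MAX_DEFS]; int n; };

/* Toy vect_get_vec_defs: callers no longer branch on the SLP case;
   the wrapper fills one def per scalar stmt in the SLP group, or a
   single def on the non-SLP path (slp_group_size == 0).  */
static void
get_vec_defs (int op, int slp_group_size, struct defs *out)
{
  out->n = 0;
  if (slp_group_size > 0)
    {
      /* SLP path: one vectorized argument per stmt in the group.  */
      for (int i = 0; i < slp_group_size; i++)
        out->vals[out->n++] = op + i;
    }
  else
    /* Non-SLP path: a single vector def for the operand.  */
    out->vals[out->n++] = op;
}
```

The payoff is the one Dorit describes: every vectorizable* function makes the same call regardless of whether it is handling an SLP node, and the switch lives in exactly one place.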
>
> and similarly in all the other vectorizable* functions, e.g., replace this:
>
> @@ -3158,15 +3206,34 @@ vectorizable_assignment (tree stmt, bloc
> vec_dest = vect_create_destination_var (scalar_dest, vectype);
>
> /* Handle use. */
> - op = GIMPLE_STMT_OPERAND (stmt, 1);
> - vec_oprnd = vect_get_vec_def_for_operand (op, stmt, NULL);
> + if (slp_node)
> + {
> + /* Get vectorized arguments for SLP_NODE. */
> + if (!vect_get_slp_defs (slp_node, &vec_oprnds, NULL))
> + return false;
> + }
> + else
> + {
> + op = GIMPLE_STMT_OPERAND (stmt, 1);
> + vec_oprnd = vect_get_vec_def_for_operand (op, stmt, NULL);
> + vec_oprnds = VEC_alloc (tree, heap, 1);
> + VEC_quick_push (tree, vec_oprnds, vec_oprnd);
> + }
>
> with a call to
> vect_get_vec_defs (op, stmt, &vec_oprnds, slp_node);
>
> etc. What do you say? Hopefully something like the above could be
> made to work for most of the vectorizable* functions, one way or
> another (except probably for vectorizable_load/store; these require
> more thought to see how, if at all possible, to reduce/hide the "if
> slp" switches).
Done.
>
> @@ -4926,16 +5101,21 @@ vectorizable_load (tree stmt, block_stmt
> ...
> +
> + if (slp)
> + VEC_quick_push (tree, SLP_TREE_VEC_STMTS (slp_node), new_stmt);
> }
>
> + if (slp)
> + continue;
> +
> if (strided_load)
>
> can you add a comment explaining the above? and/or some higher-level
> documentation explaining the flow of vectorizable_load for the slp
> case? I got a bit lost here...
Done.
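For readers following along, the control flow in the quoted vectorizable_load hunk can be modeled as below. This is my simplified reading of the diff, with a counter standing in for the strided-load permute step; the helper and its name are mine, not GCC's:

```c
#include <assert.h>

/* Toy model of the copy loop in vectorizable_load: in the SLP case
   each generated vector load is recorded on the SLP node (the
   SLP_TREE_VEC_STMTS push in the hunk), and `continue` then skips the
   strided-load reassembly that follows, since SLP consumes the loads
   directly.  Returns how many copies went through the permute step.  */
static int
loads_needing_permute (int ncopies, int slp, int strided_load)
{
  int permuted = 0;
  for (int j = 0; j < ncopies; j++)
    {
      /* ... vector load emitted here; pushed onto the SLP node's
         vector-stmts list when slp is set ... */
      if (slp)
        continue;   /* skip interleaving/permute for SLP */
      if (strided_load)
        permuted++; /* non-SLP strided access: reassemble via permute */
    }
  return permuted;
}
```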
Thanks,
Ira
>
> thanks,
> dorit
>
> > Thanks,
> > Ira
> >
> > ChangeLog:
> >
> > * tree-vect-analyze.c (vect_analyze_group_access): Collect groups of
> > strided stores for further use in SLP analysis.
> > * tree-vect-transform.c (vectorizable_reduction): Don't handle SLP
> > for now.
> > (vectorizable_call): Likewise.
> > (vectorizable_conversion): Handle SLP (call vect_get_slp_defs to
> > get SLPed defs).
> > (vectorizable_assignment): Likewise.
> > (vectorizable_induction): Don't handle SLP for now.
> > (vectorizable_operation): Handle SLP (call vect_get_slp_defs to
> > get SLPed defs).
> > (vectorizable_store, vectorizable_load): Likewise.
> > (vectorizable_condition): Don't handle SLP for now.
> > (vect_transform_loop): Skip stmts without stmt_vinfo.
> > Schedule SLP instances.
> >
> > [attachment "slp-part3.txt" deleted by Dorit Nuzman/Haifa/IBM]