This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][2/3] Vectorize inductions that are live after the loop


On Fri, May 27, 2016 at 5:12 PM, Alan Hayward <alan.hayward@arm.com> wrote:
>
> On 27/05/2016 12:41, "Richard Biener" <richard.guenther@gmail.com> wrote:
>
>>On Fri, May 27, 2016 at 11:09 AM, Alan Hayward <alan.hayward@arm.com>
>>wrote:
>>> This patch is a reworking of the previous vectorize inductions that are
>>> live
>>> after the loop patch.
>>> It now supports SLP and an optimisation has been moved to patch [3/3].
>>>
>>>
>>> Stmts which are live (ie: defined inside a loop and then used after the
>>> loop)
>>> are not currently supported by the vectorizer.  In many cases
>>> vectorization can
>>> still occur because the SCEV cprop pass will hoist the definition of the
>>> stmt
>>> outside of the loop before the vectorizor pass. However, there are
>>>various
>>> cases SCEV cprop cannot hoist, for example:
>>>   for (i = 0; i < n; ++i)
>>>     {
>>>       ret = x[i];
>>>       x[i] = i;
>>>     }
>>>    return i;
>>>
>>> Currently stmts are marked live using a bool, and the relevant state
>>>using
>>> an
>>> enum. Both these states are propagated to the definition of all uses of
>>>the
>>> stmt. Also, a stmt can be live but not relevant.
>>>
>>> This patch vectorizes a live stmt definition normally within the loop
>>>and
>>> then
>>> after the loop uses BIT_FIELD_REF to extract the final scalar value from
>>> the
>>> vector.
>>>
>>> This patch adds a new relevant state (vect_used_only_live) for when a
>>>stmt
>>> is
>>> used only outside the loop. The relevant state is still propagated to
>>>all
>>> it's
>>> uses, but the live bool is not (this ensures that
>>> vectorizable_live_operation
>>> is only called with stmts that really are live).
>>>
>>> Tested on x86 and aarch64.
>>
>>+  /* If STMT is a simple assignment and its inputs are invariant, then
>>it can
>>+     remain in place, unvectorized.  The original last scalar value that
>>it
>>+     computes will be used.  */
>>+  if (is_simple_and_all_uses_invariant (stmt, loop_vinfo))
>>     {
>>
>>so we can't use STMT_VINFO_RELEVANT or so?  I thought we somehow
>>mark stmts we don't vectorize in the end.
>
> Itâs probably worth making clear that this check exists in the current
> GCC head - today vectorize_live_operation only supports the
> simple+invariant
> case and the SSE2 case.
> My patch simply moved the simple+invariant code into the new function
> is_simple_and_all_uses_invariant.

Ah, didn't notice this.

> Looking at this again, I think what we really care about isâ.
> *If the stmt is live but not relevant, we need to mark it to ensure it is
> vectorised.
> *If a stmt is simple+invariant then a live usage of it can either use the
> scalar or vectorized version.
>
> So for a live stmt:
> *If it is simple+invariant and not relevant, then it is more optimal to
> use the
> scalar version.
> *If it is simple+invariant and relevant then it is more optimal to use the
> vectorized version.
> *If it is not simple+invariant then we must always use the vectorized
> version.

So, if ! relevant (aka not vectorized) use the scalar version.  Otherwise
use the vectorized version.  The case we fail to handle is a live but not
relevant computation derived from a vectorized stmt (the !invariant and
!relevant case).  This case should be handled by sinking the computation
up to the vectorized stmt to the exit and handling the "last" live operation
by using the vectorized def.  In fact stmt sinking should have performed
this sinking already (so a check and not vectorizing this case just in
case that didn't happen is fine I think)

> Therefore, the patch as it stands is correct but not optimal. In patch 3/3
> for
> the code above (vectorize_live_operation) I can change the check to if not
> relevant then assert that it is not simple+invariant and return true.

Sounds good.

>
>
>>
>>+  lhs = (is_a <gphi *> (stmt)) ? gimple_phi_result (stmt)
>>+       : gimple_get_lhs (stmt);
>>+  lhs_type = TREE_TYPE (lhs);
>>+
>>+  /* Find all uses of STMT outside the loop.  */
>>+  auto_vec<gimple *, 4> worklist;
>>+  FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter, lhs)
>>+    {
>>+      basic_block bb = gimple_bb (use_stmt);
>>+
>>+      if (!flow_bb_inside_loop_p (loop, bb))
>>+       worklist.safe_push (use_stmt);
>>     }
>>+  gcc_assert (!worklist.is_empty ());
>>
>>as we are in loop-closed SSA there should be exactly one such use...?
>
> Yes, I should change this to assert that worklist is of length 1.
>
>>
>>+      /* Get the correct slp vectorized stmt.  */
>>+      vec_oprnds.create (num_vec);
>>+      vect_get_slp_vect_defs (slp_node, &vec_oprnds);
>>
>>As you look at the vectorized stmt you can directly use the
>>SLP_TREE_VEC_STMTS
>>array (the stmts lhs, of course), no need to export this function.
>
> Ok, I can change this.
>
>>
>>The rest of the changes look ok to me.
>
> Does that include PATCH 1/3  ?

I don't like how 1/3 ends up looking :/  So what was the alternative again?
I looked into 1/3 and what it takes to remove the 'stmt' argument and
instead pass in a vect_def_type.  It's a bit twisted and just adding another
argument (the loop_vinfo) doesn't help things here.

So - instead of 1/3 you might want to split out a function

tree
vect_get_vec_def_for_operand_1 (gimple *def_stmt, enum vect_def_type
dt, tree vectype)
{
  switch (dt)
    {
...
    }
}

and for constant/external force vectype != NULL.

Thanks,
Richard.

>>
>>Thanks,
>>Richard.
>>
>>
>>> gcc/
>>>         * tree-vect-loop.c (vect_analyze_loop_operations): Allow live
>>> stmts.
>>>         (vectorizable_reduction): Check for new relevant state.
>>>         (vectorizable_live_operation): vectorize live stmts using
>>>         BIT_FIELD_REF.  Remove special case for gimple assigns stmts.
>>>         * tree-vect-stmts.c (is_simple_and_all_uses_invariant): New
>>> function.
>>>         (vect_stmt_relevant_p): Check for stmts which are only used
>>>live.
>>>         (process_use): Use of a stmt does not inherit it's live value.
>>>         (vect_mark_stmts_to_be_vectorized): Simplify relevance
>>>inheritance.
>>>         (vect_analyze_stmt): Check for new relevant state.
>>>         *tree-vect-slp.c (vect_get_slp_vect_defs): Make global
>>>         *tree-vectorizer.h (vect_relevant): New entry for a stmt which
>>>is
>>> used
>>>         outside the loop, but not inside it.
>>>
>>>         testsuite/
>>>         * gcc.dg/tree-ssa/pr64183.c: Ensure test does not vectorize.
>>>         * testsuite/gcc.dg/vect/no-scevccp-vect-iv-2.c: Remove xfail.
>>>         * gcc.dg/vect/vect-live-1.c: New test.
>>>         * gcc.dg/vect/vect-live-2.c: New test.
>>>         * gcc.dg/vect/vect-live-3.c: New test.
>>>         * gcc.dg/vect/vect-live-4.c: New test.
>>>         * gcc.dg/vect/vect-live-5.c: New test.
>>>         * gcc.dg/vect/vect-live-slp-1.c: New test.
>>>         * gcc.dg/vect/vect-live-slp-2.c: New test.
>>>         * gcc.dg/vect/vect-live-slp-3.c: New test.
>>>         * gcc.dg/vect/vect-live-slp-4.c: New test.
>>>
>>>
>>>
>>> Alan.
>>>
>>
>
>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]