This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [Patch] Optimize condition reductions where the result is an integer induction variable


On Wed, Nov 11, 2015 at 7:54 PM, Alan Hayward <alan.hayward@arm.com> wrote:
>
>
> On 11/11/2015 13:25, "Richard Biener" <richard.guenther@gmail.com> wrote:
>
>>On Wed, Nov 11, 2015 at 1:22 PM, Alan Hayward <alan.hayward@arm.com>
>>wrote:
>>> Hi,
>>> I hoped to post this in time for Mondayâs cut off date, but
>>>circumstances
>>> delayed me until today. Hoping if possible this patch will still be able
>>> to go in.
>>>
>>>
>>> This patch builds upon the change for PR65947, and reduces the amount of
>>> code produced in a vectorized condition reduction where operand 2 of the
>>> COND_EXPR is an assignment of a increasing integer induction variable
>>>that
>>> won't wrap.
>>>
>>>
>>> For example (assuming all types are ints), this is a match:
>>>
>>> last = 5;
>>> for (i = 0; i < N; i++)
>>>   if (a[i] < min_v)
>>>     last = i;
>>>
>>> Whereas, this is not because the result is based off a memory access:
>>> last = 5;
>>> for (i = 0; i < N; i++)
>>>   if (a[i] < min_v)
>>>     last = a[i];
>>>
>>> In the integer induction variable case we can just use a MAX reduction
>>>and
>>> skip all the code I added in my vectorized condition reduction patch -
>>>the
>>> additional induction variables in vectorizable_reduction () and the
>>> additional checks in vect_create_epilog_for_reduction (). From the patch
>>> diff only, it's not immediately obvious that those parts will be skipped
>>> as there is no code changes in those areas.
>>>
>>> The initial value of the induction variable is force set to zero, as any
>>> other value could effect the result of the induction. At the end of the
>>> loop, if the result is zero, then we restore the original initial value.
>>
>>+static bool
>>+is_integer_induction (gimple *stmt, struct loop *loop)
>>
>>is_nonwrapping_integer_induction?
>>
>>+  tree lhs_max = TYPE_MAX_VALUE (TREE_TYPE (gimple_phi_result (stmt)));
>>
>>don't use TYPE_MAX_VALUE.
>>
>>+  /* Check that the induction increments.  */
>>+  if (tree_int_cst_compare (step, size_zero_node) <= 0)
>>+    return false;
>>
>>tree_int_cst_sgn (step) == -1
>>
>>+  /* Check that the max size of the loop will not wrap.  */
>>+
>>+  if (! max_loop_iterations (loop, &ni))
>>+    return false;
>>+  /* Convert backedges to iterations.  */
>>+  ni += 1;
>>
>>just use max_stmt_executions (loop, &ni) which properly checks for
>>overflow
>>of the +1.
>>
>>+  max_loop_value = wi::add (wi::to_widest (base),
>>+                           wi::mul (wi::to_widest (step), ni));
>>+
>>+  if (wi::gtu_p (max_loop_value, wi::to_widest (lhs_max)))
>>+    return false;
>>
>>you miss a check for the wi::add / wi::mul to overflow.  You can use
>>extra args to determine this.
>>
>>Instead of TYPE_MAX_VALUE use wi::max_value (precision, sign).
>>
>>I wonder if you want to skip all the overflow checks for
>>TYPE_OVERFLOW_UNDEFINED
>>IV types?
>>
>
> Ok with all the above.
>
> Tried using max_value () but this gave me a wide_int instead of a
> widest_int.
> Instead I've replaced with min_precision and GET_MODE_BITSIZE.
>
> Added an extra check for when the type is TYPE_OVERFLOW_UNDEFINED.

+         /* Set the loop-entry arg of the reduction-phi.  */
+
+         if (STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info)
+               == INTEGER_INDUC_COND_REDUCTION)

extra vertical space

+             tree zero = build_int_cst ( TREE_TYPE (vec_init_def_type), 0);
+             tree zero_vec = build_vector_from_val (vec_init_def_type, zero);
+

build_zero_cst (vec_init_def_type);

+         else
+           {
+             add_phi_arg (as_a <gphi *> (phi), vec_init_def,
                       loop_preheader_edge (loop), UNKNOWN_LOCATION);
+           }

no {}s around single stmts

+         tree comparez = build2 (EQ_EXPR, boolean_type_node, new_temp, zero);

please no l33t speech

+         tmp = build3 (COND_EXPR, scalar_type, comparez, initial_def,
+                       new_temp);
+
+         epilog_stmt = gimple_build_assign (new_scalar_dest, tmp);
+         new_temp = make_ssa_name (new_scalar_dest, epilog_stmt);
+         gimple_assign_set_lhs (epilog_stmt, new_temp);

epilog_stmt = gimple_build_assign (make_ssa_name (new_scalar_dest),
                                                    COND_EXPR,
compare, initial_def, new_temp);


+  /* Check that the max size of the loop will not wrap.  */
+
+  if (TYPE_OVERFLOW_UNDEFINED (lhs_type))
+    {
+      return (GET_MODE_BITSIZE (TYPE_MODE (lhs_type))
+             >= GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE (base))));

this mode check will always be true as lhs_type and base are from the
same PHI node.

+  return (wi::min_precision (max_loop_value, TYPE_SIGN (lhs_type))
+         <= GET_MODE_BITSIZE (TYPE_MODE (lhs_type)));

please use TYPE_PRECISION (lhs_type) instead.

Ok with those changes.

Thanks,
Richard.

>
>
> Alan.
>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]