[PATCH V4] VECT: Add decrement IV iteration loop control by variable amount support
Richard Sandiford
richard.sandiford@arm.com
Thu May 11 11:04:42 GMT 2023
"juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai> writes:
> Hi, Richard. create_iv has been approved and will be committed soon, once
> bootstrap and regression testing have finished.
>
> Now I plan to send the patch for "decrement IV".
>
> After reading your comments, I have several questions:
>
> 1.
>>    if (use_bias_adjusted_len)
>>      return rgl->bias_adjusted_ctrl;
>> +  else if (direct_internal_fn_supported_p (IFN_SELECT_VL, iv_type,
>> +                                           OPTIMIZE_FOR_SPEED))
>> +    {
>> +      tree loop_len = rgl->controls[index];
>> +      poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type);
>> +      poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype);
>> +      if (maybe_ne (nunits1, nunits2))
>> +        {
>> +          /* A loop len for data type X can be reused for data type Y
>> +             if X has N times more elements than Y and if Y's elements
>> +             are N times bigger than X's.  */
>> +          gcc_assert (multiple_p (nunits1, nunits2));
>> +          unsigned int factor = exact_div (nunits1, nunits2).to_constant ();
>> +          gimple_seq seq = NULL;
>> +          loop_len = gimple_build (&seq, RDIV_EXPR, iv_type, loop_len,
>> +                                   build_int_cst (iv_type, factor));
>> +          if (seq)
>> +            gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
>> +        }
>> +      return loop_len;
>> +    }
>>    else
>>      return rgl->controls[index];
>>  }
>
>> ...here. That is, the key isn't whether SELECT_VL is available,
>> but instead whether we've decided to use it for this loop (unless
>> I'm missing something).
>
> Let me clarify it again:
>
> I do this here for Case 2 (SLP):
>
> Generated for len: _61 = _75 / 2;
> I think it is similar to ARM SVE using VIEW_CONVERT_EXPR to view-convert the mask.
>
> You said we should not let the availability of SELECT_VL decide this here.
> Could you explain how I should handle this code? Should I add a target hook like
> TARGET_SLP_LOOP_LEN_RDIV_BY_FACTOR_P?

No.  What I mean is: for each vectorised loop, we should make a decision,
in one place only, whether to use SELECT_VL-based control flow or
arithmetic-based control flow for that particular loop.  That decision
depends partly on direct_internal_fn_supported_p (a necessary but not
sufficient condition), partly on whether the loop contains SLP nodes, etc.
We should then record that decision in the loop_vec_info so that it is
available to whichever code needs it.

This is similar to how LOOP_VINFO_USING_PARTIAL_VECTORS_P etc. work.
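
To illustrate the shape of this, here is a minimal standalone sketch, not
actual GCC code.  loop_info_model, decide_control_flow and loop_len_for_type
are hypothetical names made up for the example; the two boolean inputs stand
in for direct_internal_fn_supported_p and for the remaining per-loop checks
(SLP nodes etc.), whose exact form is left open here.  The consumer mirrors
the quoted hunk, but tests the recorded flag instead of querying the target
again.

```cpp
#include <cassert>

// Hypothetical stand-in for the relevant part of loop_vec_info.
struct loop_info_model
{
  // Recorded once per loop; plays the role of a LOOP_VINFO_*-style flag.
  bool using_select_vl_p = false;
};

// The single place where the decision is made.  Target support is
// necessary but not sufficient; loop_checks_pass is an assumed
// placeholder for the other per-loop conditions.
inline void
decide_control_flow (loop_info_model &info, bool target_supports_select_vl,
                     bool loop_checks_pass)
{
  info.using_select_vl_p = target_supports_select_vl && loop_checks_pass;
}

// Consumer: reads only the recorded decision.  nunits1 and nunits2 are
// plain integers here, whereas the real code uses poly_int64.
inline unsigned
loop_len_for_type (const loop_info_model &info, unsigned loop_len,
                   unsigned nunits1, unsigned nunits2)
{
  if (info.using_select_vl_p && nunits1 != nunits2)
    {
      // Same precondition as the gcc_assert in the quoted hunk.
      assert (nunits2 != 0 && nunits1 % nunits2 == 0);
      unsigned factor = nunits1 / nunits2;
      return loop_len / factor;   // e.g. _61 = _75 / 2
    }
  return loop_len;
}
```

With this split, a loop for which the target supports SELECT_VL but the
other checks fail keeps the unscaled length, because only the recorded flag
is consulted.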
Thanks,
Richard