This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST


On Wed, Sep 2, 2015 at 10:52 AM, Ajit Kumar Agarwal
<ajit.kumar.agarwal@xilinx.com> wrote:
>
>
> -----Original Message-----
> From: gcc-owner@gcc.gnu.org [mailto:gcc-owner@gcc.gnu.org] On Behalf Of Ajit Kumar Agarwal
> Sent: Wednesday, August 19, 2015 2:53 PM
> To: Richard Biener
> Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
> Subject: RE: vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST
>
>
>
> -----Original Message-----
> From: gcc-owner@gcc.gnu.org [mailto:gcc-owner@gcc.gnu.org] On Behalf Of Ajit Kumar Agarwal
> Sent: Monday, August 17, 2015 4:03 PM
> To: Richard Biener
> Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
> Subject: RE: vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST
>
>
>
> -----Original Message-----
> From: Richard Biener [mailto:richard.guenther@gmail.com]
> Sent: Friday, August 14, 2015 9:59 PM
> To: Ajit Kumar Agarwal
> Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
> Subject: RE: vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST
>
> On August 14, 2015 5:03:58 PM GMT+02:00, Ajit Kumar Agarwal <ajit.kumar.agarwal@xilinx.com> wrote:
>>
>>
>>-----Original Message-----
>>From: Richard Biener [mailto:richard.guenther@gmail.com]
>>Sent: Monday, August 03, 2015 2:59 PM
>>To: Ajit Kumar Agarwal
>>Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta;
>>Vidhumouli Hunsigida; Nagaraju Mekala
>>Subject: Re: vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST
>>
>>On Sun, Aug 2, 2015 at 4:13 PM, Ajit Kumar Agarwal
>><ajit.kumar.agarwal@xilinx.com> wrote:
>>> All:
>>>
>>> The following macro determines the statement cost that is added to
>>> the vectorization cost.
>>>
>>> #define TARGET_VECTORIZE_ADD_STMT_COST.
>>>
>>> In the implementation of the above macro the following is done for
>>many vectorization supported architectures like i386, ARM.
>>>
>>> if (where == vect_body && stmt_info && stmt_in_inner_loop_p
>>(stmt_info))
>>>         count *= 50;  /* FIXME.  */
>>>
>>> I have the  following questions.
>>>
>>> 1. Why was the multiplication factor of 50 chosen?
>>
>>>>It's a wild guess.  See
>>tree-vect-loop.c:vect_get_single_scalar_iteration_cost.
>>
>>> 2. The comment mentions that statements in the inner loop, relative to
>>> the loop being vectorized, are given more weight. If more weight is
>>> added to the inner loop of the loop being vectorized, the chances of
>>> vectorizing the inner loop decrease. Why is the inner loop cost
>>> increased relative to the loop being vectorized?
>>
>>>>In fact adding more weight to the inner loop increases the chance of
>>vectorizing it (if vectorizing the inner loop is profitable).
>>>>Both scalar and vector cost get biased by a factor of 50 (we assume
>>50 iterations of the inner loop for one iteration of the outer loop),
>>so a non-profitable vectorization in the outer loop can be offset
>>by profitable inner loop vectorization.
>>
>>>>Yes, '50' can be improved if we actually know the iteration count of
>>the inner loop or if we have profile-feedback.
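The effect of the bias described above can be sketched in C. This is only an illustration, not GCC's cost model: `struct loop_cost`, `weighted_cost`, `vectorization_profitable` and `ASSUMED_INNER_ITERS` are hypothetical names, and the profitability test is simplified to comparing one vector iteration against VF scalar iterations of the outer loop.

```c
#include <assert.h>

/* Hypothetical constant mirroring the "count *= 50" guess quoted above. */
enum { ASSUMED_INNER_ITERS = 50 };

/* Hypothetical per-iteration costs of one outer-loop iteration, split by
   where the statements live.  Not a GCC data structure. */
struct loop_cost {
    int outer_body;  /* statements outside the inner loop */
    int inner_body;  /* statements inside the inner loop, per inner iteration */
};

/* Cost of one outer iteration when inner-loop statements are weighted by
   the assumed number of inner iterations. */
int weighted_cost(struct loop_cost c, int inner_iters)
{
    assert(inner_iters >= 0);
    return c.outer_body + c.inner_body * inner_iters;
}

/* Returns 1 when one vector iteration is cheaper than the VF scalar
   iterations it replaces, 0 otherwise.  The same weight is applied to both
   sides, as in the reply above. */
int vectorization_profitable(struct loop_cost scalar, struct loop_cost vector,
                             int vf, int inner_iters)
{
    assert(vf > 0);
    return weighted_cost(vector, inner_iters)
           < vf * weighted_cost(scalar, inner_iters);
}
```

With made-up costs scalar = {2, 4}, vector = {20, 6} and VF = 4, a weight of 1 gives 26 against 24 (not profitable), while a weight of `ASSUMED_INNER_ITERS` gives 320 against 808 (profitable): the expensive outer-loop part is outweighed by the cheaper inner-loop part.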
>>
>>Instead of biasing both the vector and scalar cost by a factor of 50, can
>>the benefit of vectorization be calculated as follows?
>>
>>Benefit = scalar cost - vector cost/VF;
>>Cost = 0;
>>for (i = 1; i < N; i++) {
>>    Cost = Cost + (Final_value - Initial_value)/Steps;
>>}
>>
>>Benefit = Benefit * Cost;
>>
>>Where
>>N = No. of levels of the loop;
>>Final_value = Final iteration count of the loop.
>>Initial_value = Initial Iteration count of the loop.
>>Steps = steps of the iteration for the loop.
>>VF = vectorization factor.
>>
>>Thus an increase in the number of loop levels increases the benefit of
>>vectorization. Also, if the scalar cost is greater than the vector
>>cost, then scalar cost - vector cost/VF increases for the same
>>vectorization factor, thus increasing the benefit of vectorization.
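The proposed formula can be written out as a small C sketch. It is one reading of the proposal, not code from GCC: `struct loop_level`, `trip_count_estimate` and `vectorization_benefit` are hypothetical names, and the sketch sums over every loop level passed in and assumes a positive step at each level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one loop level in a nest. */
struct loop_level {
    long initial_value;  /* initial value of the induction variable */
    long final_value;    /* final value of the induction variable */
    long step;           /* increment per iteration, assumed > 0 */
};

/* (Final_value - Initial_value) / Steps for a single loop level. */
long trip_count_estimate(struct loop_level l)
{
    assert(l.step > 0);
    return (l.final_value - l.initial_value) / l.step;
}

/* Benefit = (scalar cost - vector cost / VF) * sum of trip counts. */
double vectorization_benefit(double scalar_cost, double vector_cost, int vf,
                             const struct loop_level *levels, size_t n_levels)
{
    assert(vf > 0);
    double per_iteration = scalar_cost - vector_cost / vf;
    long total_iters = 0;
    for (size_t i = 0; i < n_levels; i++)
        total_iters += trip_count_estimate(levels[i]);
    return per_iteration * (double) total_iters;
}
```

For example, with scalar cost 10, vector cost 16, VF = 4 and two levels running 0..100 step 1 and 0..8 step 2, the per-iteration saving is 6 and the summed trip count is 104, so the benefit is 624.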
>
>>>Sure.  But the number of iterations may only be available symbolically, so the cost would only be useful for the dynamic check at runtime.  A better static estimate would also be useful.
>
>>>Thanks. For the cases where the loop bound can be known at compile time, through value range analysis, GCC already uses the value range information to calculate the loop bound. We can use the same loop bound info to get a static estimate of the number of iterations.
>
>  >> Based on the above estimates, the cost calculation I mentioned above can be used for vectorization cost analysis.
>
>>>On top of the above, the vectorizer cannot vectorize loops if the trip count or iteration count is not known. So whenever the vectorizer needs the number of iterations for its cost calculation, the trip count or iteration count is known. The sum of the iteration counts over all loop levels where the iteration count is known gives the static estimate, which is included in the above vectorization cost. Loops whose iteration or trip count is not known cannot be vectorized, so their iteration counts can be neglected in the vectorization cost calculation.
>
>>>For those loops only SLP or partial vectorization is possible, which considers isomorphic operations instead of vectorizing based on the trip or iteration count of the loops.
>
> To support the above explanation, the following code emits a "not vectorized" message when the number of iterations determined by vect_get_loop_niters() is undetermined, that is, when the chrec representation from scalar evolution, in the tree form {"base", "+", "step"},
> leaves the number of iterations undetermined.
>
> if (!number_of_iterations
>       || chrec_contains_undetermined (number_of_iterations))
>     {
>       if (dump_enabled_p ())
>         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>                          "not vectorized: number of iterations cannot be "
>                          "computed.\n");
>       if (inner_loop_vinfo)
>         destroy_loop_vec_info (inner_loop_vinfo, true);
>       return NULL;
>     }
>
> Thus innerloop_iters, which is currently assigned 50, can instead take the number_of_iterations obtained from vect_get_loop_niters(). Instead of assigning
> 50, I am going to use the number_of_iterations determined by vect_get_loop_niters() if the chrec for the loop iteration count is known and determined.
>
> For a better approximation in the cost heuristics, the sum of the loop iteration counts at all loop levels can be used instead of
> considering only the inner loop iteration count.
>
> Let me know if you see any issues with the above change.

There are no issues if you properly make use of this everywhere (scan
for the use of the magic number and don't forget the target cost
hooks).  Note that this means that loops with constant outer iteration
count and non-constant inner iteration count will be forced to have a
dynamic runtime check for profitability instead of a
compile-time check.  When only one of the loops has a constant
iteration count one might still want to do a compile-time
check only if that provides sufficient information to decide profitability.
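The consequence described above can be sketched as a tiny decision function. This is an assumption-laden illustration rather than vectorizer code: `NITERS_UNKNOWN`, `enum profit_check` and `choose_profit_check` are hypothetical names, and the sketch covers only the basic rule (all counts constant means a compile-time check), not the refinement for the case where a single known count already settles profitability.

```c
#include <assert.h>

/* Hypothetical marker for an iteration count that is not a compile-time
   constant (for example, only known symbolically). */
#define NITERS_UNKNOWN (-1L)

/* Hypothetical result: where the profitability decision has to be made. */
enum profit_check {
    CHECK_COMPILE_TIME,  /* all counts constant, decide statically */
    CHECK_RUNTIME        /* some count unknown, emit a dynamic check */
};

/* If the real inner iteration count replaces the constant 50, the cost
   depends on it, so an unknown inner (or outer) count forces a runtime
   profitability check. */
enum profit_check choose_profit_check(long outer_iters, long inner_iters)
{
    assert(outer_iters >= NITERS_UNKNOWN && inner_iters >= NITERS_UNKNOWN);
    if (outer_iters != NITERS_UNKNOWN && inner_iters != NITERS_UNKNOWN)
        return CHECK_COMPILE_TIME;
    return CHECK_RUNTIME;
}
```

Under this sketch, a nest with a constant outer count of 100 and an unknown inner count yields `CHECK_RUNTIME`, which is the case the note above warns about.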

Richard.

> Thanks & Regards
> Ajit
>
>>>Thanks & Regards
>>>Ajit
>
> Thanks & Regards
> Ajit
>
> Richard.
>
>>Thanks & Regards
>>Ajit
>>
>>Richard.
>>
>>
>>> Thanks & Regards
>>> Ajit
>
>

