This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



RE: vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST



-----Original Message-----
From: gcc-owner@gcc.gnu.org [mailto:gcc-owner@gcc.gnu.org] On Behalf Of Ajit Kumar Agarwal
Sent: Wednesday, August 19, 2015 2:53 PM
To: Richard Biener
Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: RE: vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST



-----Original Message-----
From: gcc-owner@gcc.gnu.org [mailto:gcc-owner@gcc.gnu.org] On Behalf Of Ajit Kumar Agarwal
Sent: Monday, August 17, 2015 4:03 PM
To: Richard Biener
Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: RE: vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST



-----Original Message-----
From: Richard Biener [mailto:richard.guenther@gmail.com]
Sent: Friday, August 14, 2015 9:59 PM
To: Ajit Kumar Agarwal
Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: RE: vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST

On August 14, 2015 5:03:58 PM GMT+02:00, Ajit Kumar Agarwal <ajit.kumar.agarwal@xilinx.com> wrote:
>
>
>-----Original Message-----
>From: Richard Biener [mailto:richard.guenther@gmail.com]
>Sent: Monday, August 03, 2015 2:59 PM
>To: Ajit Kumar Agarwal
>Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; 
>Vidhumouli Hunsigida; Nagaraju Mekala
>Subject: Re: vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST
>
>On Sun, Aug 2, 2015 at 4:13 PM, Ajit Kumar Agarwal 
><ajit.kumar.agarwal@xilinx.com> wrote:
>> All:
>>
>> The following macro determines the statement cost that is added to the
>> vectorization cost.
>>
>> #define TARGET_VECTORIZE_ADD_STMT_COST.
>>
>> Many architectures that support vectorization, such as i386 and ARM,
>> implement this macro as follows:
>>
>> if (where == vect_body && stmt_info && stmt_in_inner_loop_p (stmt_info))
>>         count *= 50;  /* FIXME.  */
>>
>> I have the following questions.
>>
>> 1. Why was the multiplication factor of 50 chosen?
>
>>>It's a wild guess.  See
>tree-vect-loop.c:vect_get_single_scalar_iteration_cost.
>
>> 2. The comment says that statements in the inner loop, relative to the
>> loop being vectorized, are given more weight. If more weight is added to
>> the inner loop, the chance of vectorizing the inner loop decreases. Why is
>> the inner loop cost increased relative to the loop being vectorized?
>
>>>In fact, adding more weight to the inner loop increases the chance of
>vectorizing it (if vectorizing the inner loop is profitable).
>>>Both the scalar and the vector cost get biased by a factor of 50 (we assume
>50 iterations of the inner loop for one iteration of the outer loop),
>so a non-profitable vectorization in the outer loop can be offset
>by a profitable inner loop vectorization.
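
To make the biasing argument above concrete, here is a minimal sketch in C. It is a simplified model, not GCC's cost code: the function names, the per-statement costs and the explicit inner_weight parameter are all assumptions made for illustration. It only shows that weighting the inner-loop part of both the scalar and the vector cost lets a cheap vectorized inner loop outweigh an expensive vectorized outer-loop part.

```c
#include <assert.h>

/* Cost of one outer-loop iteration.  Statements in the inner loop are
   weighted by INNER_WEIGHT, the assumed number of inner iterations
   (hypothetical parameter; the mail discusses a fixed guess of 50).  */
static long
weighted_cost (long outer_stmt_cost, long inner_stmt_cost, long inner_weight)
{
  assert (outer_stmt_cost >= 0 && inner_stmt_cost >= 0 && inner_weight > 0);
  return outer_stmt_cost + inner_weight * inner_stmt_cost;
}

/* Return 1 if one vector iteration is cheaper than the VF scalar
   iterations it replaces, 0 otherwise.  The same weight is applied to
   both the scalar and the vector side.  */
static int
vectorization_profitable_p (long scalar_outer, long scalar_inner,
                            long vector_outer, long vector_inner,
                            long vf, long inner_weight)
{
  assert (vf > 0);
  long scalar_cost
    = vf * weighted_cost (scalar_outer, scalar_inner, inner_weight);
  long vector_cost
    = weighted_cost (vector_outer, vector_inner, inner_weight);
  return vector_cost < scalar_cost;
}
```

With made-up costs scalar_outer = 2, scalar_inner = 4, vector_outer = 20, vector_inner = 12 and VF = 4, the outer part alone is not profitable (20 against 8) while the inner part is (12 against 16). With a weight of 1 the whole loop looks unprofitable; with a weight of 50 the inner loop dominates and it looks profitable.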
>
>>>Yes, '50' can be improved if we actually know the iteration count of
>the inner loop or if we have profile-feedback.
>
>Instead of biasing the vector and scalar costs by a factor of 50, can the
>benefit of vectorization be calculated as follows?
>
>Benefit = scalar cost - vector cost/VF;
>Cost = 0;
>for (i = 1; i < N; i++) {
>    Cost = Cost + (Final_value - Initial_value)/Steps;
>}
>
>Benefit = Benefit * Cost;
>
>Where
>N = number of levels of the loop.
>Final_value = final iteration count of the loop.
>Initial_value = initial iteration count of the loop.
>Steps = step of the iteration for the loop.
>VF = vectorization factor.
>
>Thus an increase in the number of loop levels increases the benefit of
>vectorization. Also, if the scalar cost is more than the vector cost,
>then scalar cost - vector cost/VF increases for the same vectorization
>factor, thus increasing the benefit of vectorization.
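
The proposal above could be sketched in C roughly as follows. This is only an illustration under assumptions: struct loop_level and vectorization_benefit are hypothetical names, every level is assumed to have known constant bounds and a positive step, and the sum is taken over all N levels (the loop bounds written in the mail leave that ambiguous).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one level of a loop nest.  */
struct loop_level
{
  long initial_value;  /* first value of the induction variable */
  long final_value;    /* bound of the induction variable */
  long step;           /* increment per iteration, assumed > 0 */
};

/* Benefit = (scalar cost - vector cost / VF) * sum of the per-level
   iteration counts, following the formula proposed in the mail.  */
static double
vectorization_benefit (double scalar_cost, double vector_cost, int vf,
                       const struct loop_level *levels, size_t n_levels)
{
  assert (vf > 0);
  double benefit = scalar_cost - vector_cost / vf;
  long cost = 0;

  for (size_t i = 0; i < n_levels; i++)
    {
      assert (levels[i].step > 0);
      cost += (levels[i].final_value - levels[i].initial_value)
              / levels[i].step;
    }
  return benefit * cost;
}
```

For example, with a scalar cost of 10, a vector cost of 16, VF = 4 and two levels running 0..100 by 1 and 0..8 by 2, the per-iteration benefit is 10 - 16/4 = 6, the summed iteration count is 100 + 4 = 104, and the result is 624.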

>>Sure.  But the number of iterations may only be available symbolically, so the cost may only be useful for the dynamic check at runtime.  A better static estimate would also be useful.

>>Thanks. For the cases where the loop bound can be known at compile time, it can be obtained through value range analysis. GCC already uses value range information to calculate the loop bound. We can use the same loop bound information to get a static estimate of the number of iterations.

>>Based on the above estimates, the cost calculation I have described can be used for vectorization cost analysis.
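
As a small illustration of such a static estimate, the sketch below computes the trip count of a simple counted loop once its bounds are compile-time constants. It does not model GCC's value range analysis; static_trip_count is a hypothetical helper, and the loop shape "for (i = initial; i < final; i += step)" with a positive step is an assumption.

```c
#include <assert.h>

/* Trip count of: for (i = initial; i < final; i += step), step > 0.
   Overflow of FINAL - INITIAL is not handled in this sketch.  */
static long
static_trip_count (long initial, long final, long step)
{
  assert (step > 0);
  if (final <= initial)
    return 0;  /* the loop body never executes */
  /* Round up: a partial last step still executes one iteration.  */
  return (final - initial + step - 1) / step;
}
```

For bounds 0 and 10 with step 3 this gives 4 iterations (i = 0, 3, 6, 9). Note that it rounds up, whereas the plain division in the formula above rounds down.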

>>On top of the above, the vectorizer cannot vectorize loops whose trip count or iteration count is not known. So whenever the vectorizer needs the number of iterations for its cost calculation, the trip count or iteration count is known. The sum of the iteration counts of all the loop levels where the iteration count is known gives the static estimate, and it is included in the above vectorization cost. For loops where the iteration or trip count is not known, the vectorizer cannot vectorize, and the iteration count of such cases can be neglected in the vectorization cost calculation.

>>In such cases only SLP or partial vectorization is possible, which considers isomorphic operations instead of vectorizing based on the trip or iteration count of the loops.

To support the above explanation, the following code emits the message "not vectorized: number of iterations cannot be computed." when the number of iterations determined by vect_get_loop_niters() is undetermined, i.e. when the chrec from scalar evolution, of the form {base, +, step}, cannot determine the number of iterations.
 
  if (!number_of_iterations
      || chrec_contains_undetermined (number_of_iterations))
    {
      if (dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "not vectorized: number of iterations cannot be "
                         "computed.\n");
      if (inner_loop_vinfo)
        destroy_loop_vec_info (inner_loop_vinfo, true);
      return NULL;
    }

Thus the innerloop_iters value, currently assigned 50, can be replaced with the number_of_iterations obtained from vect_get_loop_niters(). I am going to make this replacement whenever the chrec for the loop iteration count is known and determined.
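
A minimal sketch of the intended change, written as standalone C rather than against the GCC sources: the function names and the niters_known flag are hypothetical, and in GCC the count would come from the vectorizer's own analysis rather than from a parameter. The idea is only that the constant 50 becomes a fallback used when the inner-loop iteration count is not determined.

```c
#include <assert.h>

/* Fallback guess used today for the inner-loop iteration count.  */
enum { DEFAULT_INNER_LOOP_ITERS = 50 };

/* NITERS_KNOWN is nonzero when the inner-loop iteration count was
   determined at compile time; NITERS is only meaningful in that case.  */
static long
inner_loop_weight (int niters_known, long niters)
{
  if (niters_known)
    {
      assert (niters >= 0);
      return niters;
    }
  return DEFAULT_INNER_LOOP_ITERS;
}

/* Scale a statement count the way the "count *= 50" line does, but with
   the known iteration count when one is available.  */
static long
weighted_stmt_count (long count, int in_inner_loop,
                     int niters_known, long niters)
{
  assert (count >= 0);
  if (in_inner_loop)
    count *= inner_loop_weight (niters_known, niters);
  return count;
}
```

With a statement count of 3, a known inner trip count of 8 gives a weighted count of 24, while an unknown trip count still gives 150 as it does today.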

For a better approximation in the cost heuristics, the sum of the loop iterations at all levels of the loop nest can be used instead of considering only the inner loop iteration counts.

Let me know if you see any issues with the above change.

Thanks & Regards
Ajit



