[PATCH 1v2/3][vect] Add main vectorized loop unrolling

Richard Biener rguenther@suse.de
Thu Nov 25 12:46:13 GMT 2021


On Thu, 25 Nov 2021, Andre Vieira (lists) wrote:

> 
> On 24/11/2021 11:00, Richard Biener wrote:
> > On Wed, 24 Nov 2021, Andre Vieira (lists) wrote:
> >
> >> On 22/11/2021 12:39, Richard Biener wrote:
> >>> +  if (first_loop_vinfo->suggested_unroll_factor > 1)
> >>> +    {
> >>> +      if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
> >>> +       {
> >>> +         if (dump_enabled_p ())
> >>> +           dump_printf_loc (MSG_NOTE, vect_location,
> >>> +                            "***** Re-trying analysis with first vector
> >>> mode"
> >>> +                            " %s for epilogue with partial vectors of"
> >>> +                            " unrolled first loop.\n",
> >>> +                            GET_MODE_NAME (vector_modes[0]));
> >>> +         mode_i = 0;
> >>>
> >>> and the later done check for bigger VF than main loop - why would
> >>> we re-start at 0 rather than at the old mode?  Maybe we want to
> >>> remember the iterator value we started at when arriving at the
> >>> main loop mode?  So if we analyzed successfully with mode_i == 2,
> >>> then sucessfully at mode_i == 4 which suggested an unroll of 2,
> >>> re-start at the mode_i we continued after the mode_i == 2
> >>> successful analysis?  To just consider the "simple" case of
> >>> AVX vs SSE it IMHO doesn't make much sense to succeed with
> >>> AVX V4DF, succeed with SSE V2DF and figure it's better than V4DF AVX
> >>> but get a suggestion of 2 times unroll and then re-try AVX V4DF
> >>> just to re-compute that yes, it's worse than SSE V2DF?  You
> >>> are probably thinking of SVE vs ADVSIMD here but do we need to
> >>> start at 0?  Adding a comment to the code would be nice.
> >>>
> >>> Thanks,
> >> I was indeed thinking SVE vs Advanced SIMD where we end up having to
> >> compare
> >> different vectorization strategies, which will have different costs
> >> depending.
> >> The hypothetical case, as in I don't think I've come across one, is where
> >> if
> >> we decide to vectorize the main loop for V8QI and unroll 2x, yielding a VF
> >> of
> >> 16, we may then want to then use a predicated VNx16QI epilogue.
> > But this isn't the epilogue handling ...
> Am I misunderstanding the code here? To me it looks like this is picking what
> mode_i to start the 'while (1)' loop does the loop analysis for the epilogues?

Oops, my fault, yes, it does.  I would suggest to refactor things so
that the mode_i = first_loop_i case is there only once.  I also wonder
if all the argument about starting at 0 doesn't apply to the
not unrolled LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P as well?  So
what's the reason to differ here?  So in the end I'd just change
the existing

  if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
    {

to

  if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo)
      || first_loop_vinfo->suggested_unroll_factor > 1)
    {

and maybe revisit this when we have an actual testcase showing that
doing sth else has a positive effect?

Thanks,
Richard.


More information about the Gcc-patches mailing list