[PATCH 1v2/3][vect] Add main vectorized loop unrolling

Richard Biener rguenther@suse.de
Mon Nov 22 12:39:19 GMT 2021


On Mon, 22 Nov 2021, Andre Vieira (lists) wrote:

> 
> On 12/11/2021 13:12, Richard Biener wrote:
> > On Thu, 11 Nov 2021, Andre Vieira (lists) wrote:
> >
> >> Hi,
> >>
> >> This is the rebased and reworked version of the unroll patch.  I wasn't
> >> entirely sure whether I should compare the costs of the unrolled loop_vinfo
> >> with the original loop_vinfo it was unrolled of. I did now, but I wasn't
> >> too
> >> sure whether it was a good idea to... Any thoughts on this?
> > +  /* Apply the suggested unrolling factor, this was determined by the
> > backend
> > +     during finish_cost the first time we ran the analyzis for this
> > +     vector mode.  */
> > +  if (loop_vinfo->suggested_unroll_factor > 1)
> > +    {
> > +      poly_uint64 unrolled_vf
> > +       = LOOP_VINFO_VECT_FACTOR (loop_vinfo) *
> > loop_vinfo->suggested_unroll_factor;
> > +      /* Make sure the unrolled vectorization factor is less than the max
> > +         vectorization factor.  */
> > +      unsigned HOST_WIDE_INT max_vf = LOOP_VINFO_MAX_VECT_FACTOR
> > (loop_vinfo);
> > +      if (max_vf == MAX_VECTORIZATION_FACTOR || known_le (unrolled_vf,
> > max_vf))
> > +       LOOP_VINFO_VECT_FACTOR (loop_vinfo) = unrolled_vf;
> > +      else
> > +       return opt_result::failure_at (vect_location,
> > +                                      "unrolling failed: unrolled"
> > +                                      " vectorization factor larger than"
> > +                                      " maximum vectorization factor:
> > %d\n",
> > +                                      LOOP_VINFO_MAX_VECT_FACTOR
> > (loop_vinfo));
> > +    }
> > +
> >     /* This is the point where we can re-start analysis with SLP forced
> > off.  */
> >   start_over:
> >
> > So we're honoring suggested_unroll_factor here but you still have the
> > now unused hunk
> >
> > +vect_analyze_loop_2 (loop_vec_info loop_vinfo, bool &fatal,
> > +                    unsigned *suggested_unroll_factor, poly_uint64 min_vf
> > = 2)
> >   {
> >
> > I also wonder whether vect_analyze_loop_2 could at least prune
> > suggested_unroll_factor as set by vect_analyze_loop_costing with its
> > knowledge of max_vf itself?  That would avoid using the at the moment
> > unused LOOP_VINFO_MAX_VECT_FACTOR?
> >
> > I think all the things you do in vect_can_unroll should be figured
> > out with the re-analysis, and I'd just amend vect_analyze_loop_1
> > with a suggested unroll factor parameter like it has main_loop_vinfo
> > for the epilogue case.  The main loop adjustment would the be in the
> >
> >            if (first_loop_vinfo == NULL)
> >              {
> >                first_loop_vinfo = loop_vinfo;
> >                first_loop_i = loop_vinfo_i;
> >                first_loop_next_i = mode_i;
> >              }
> Sounds good.
> >
> > spot only, adding
> >
> > if (loop_vinfo->suggested_unroll_factor != 1)
> >    {
> >      suggested_unroll_factor = loop_vinfo->suggested_unroll_factor;
> >      mode_i = first_loop_i;
> >      if (dump)
> >        dump_print ("Trying unrolling by %d\n");
> >      continue;
> >    }
> Not quite like this because of how we need to keep the suggestion given at
> finish_costs, put into suggested_unroll_factor, separate from how we tell
> vect_analyze_loop_1 that we are now vectorizing an unrolled vector loop, which
> we do by writing to loop_vinfo->suggested_unroll_factor. Perhaps I should
> renamed the latter, to avoid confusion? Let me know if you think that would
> help and in the mean-time this is what the patch looks like now. I'll follow
> up with a ChangeLog when we settle on the name & structure.

I think the patch looks OK, I'm only wondering about

+  if (first_loop_vinfo->suggested_unroll_factor > 1)
+    {
+      if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
+       {
+         if (dump_enabled_p ())
+           dump_printf_loc (MSG_NOTE, vect_location,
+                            "***** Re-trying analysis with first vector
mode"
+                            " %s for epilogue with partial vectors of"
+                            " unrolled first loop.\n",
+                            GET_MODE_NAME (vector_modes[0]));
+         mode_i = 0;

and the later done check for bigger VF than main loop - why would
we re-start at 0 rather than at the old mode?  Maybe we want to
remember the iterator value we started at when arriving at the
main loop mode?  So if we analyzed successfully with mode_i == 2,
then sucessfully at mode_i == 4 which suggested an unroll of 2,
re-start at the mode_i we continued after the mode_i == 2
successful analysis?  To just consider the "simple" case of
AVX vs SSE it IMHO doesn't make much sense to succeed with
AVX V4DF, succeed with SSE V2DF and figure it's better than V4DF AVX
but get a suggestion of 2 times unroll and then re-try AVX V4DF
just to re-compute that yes, it's worse than SSE V2DF?  You
are probably thinking of SVE vs ADVSIMD here but do we need to
start at 0?  Adding a comment to the code would be nice.

Thanks,
Richard.


More information about the Gcc-patches mailing list