[Bug tree-optimization/105219] [12 Regression] SVE: Wrong code with -O3 -msve-vector-bits=128 -mtune=thunderx

Wed Apr 27 12:02:04 GMT 2022

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105219

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Martin Liška from comment #13)
> (In reply to Richard Biener from comment #11)
> > int data[128];
> > 
> > void __attribute((noipa))
> > foo (int *data, int n)
> > {
> >   for (int i = 0; i < n; ++i)
> >     data[i] = i;
> > }
> > 
> > int main()
> > {
> >   for (int start = 0; start < 16; ++start)
> >     for (int n = 1; n < 3*16; ++n)
> >       {
> >         __builtin_memset (data, 0, sizeof (data));
> >         foo (&data[start], n);
> >         for (int j = 0; j < n; ++j)
> >           if (data[start + j] != j)
> >             __builtin_abort ();
> >       }
> >   return 0;
> > }
> > 
> > for example aborts with -O3 -mtune=intel -fno-vect-cost-model on x86_64,
> > the cost model disabling is necessary to have the epilogue vectorized.
> > Without a cost model we peel for the maximum number of aligned refs
> > but still use the target cost to decide whether peeling is worthwhile at all.
> 
> This one started with r12-1181-g7ed1cd9665d8ca0f.

So the following fixes the x86 failure for me.  Can somebody check whether it
also fixes the observed aarch64 issue?

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index d7bc34636bd..3b63ab7b669 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -9977,7 +9981,7 @@ vect_transform_loop (loop_vec_info loop_vinfo, gimple *loop_vectorized_call)
                            lowest_vf) - 1
           : wi::udiv_floor (loop->nb_iterations_upper_bound + bias_for_lowest,
                             lowest_vf) - 1);
-      if (main_vinfo)
+      if (main_vinfo && !main_vinfo->peeling_for_alignment)
        {
          unsigned int bound;
          poly_uint64 main_iters