[PATCH] Yet another simple fix to enhance outer-loop vectorization.

Richard Biener richard.guenther@gmail.com
Wed Jun 17 12:28:00 GMT 2015


On Tue, Jun 16, 2015 at 4:12 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Thanks a lot Richard for your review.
>
> I presented updated patch which is not gated by force_vectorize.
> I added test on outer-loop in vect_enhance_data_refs_alignment
> and it returns false for it because we can not improve dr alighment
> through outer-loop peeling in general. So I assume that only
> versioning for alignment can be applied for targets do not support
> unaligned memory access.

@@ -998,7 +998,12 @@
   gimple stmt = DR_STMT (dr);
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
+  loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);

+  /* Peeling of outer loops can't improve alignment.  */
+  if (loop_vinfo && LOOP_VINFO_LOOP (loop_vinfo)->inner)
+    return false;
+

but this looks wrong.  It depends on the context (DRs in the outer loop
can improve alignment by peeling the outer loop and we can still
peel the inner loop for alignment).

So IMHO the correct place to amend is vect_enhance_data_refs_alignment
(which it seems doesn't consider peeling the inner loop).

I'd say you should simply add

     || loop->inner)

to the

  /* Check if we can possibly peel the loop.  */
  if (!vect_can_advance_ivs_p (loop_vinfo)
      || !slpeel_can_duplicate_loop_p (loop, single_exit (loop)))
    do_peeling = false;

check.

> I did not change tests for outer loops in slpeel_can_duplicate_loop_p
> as you proposed since it is not called outside vectorization.

There is still no reason for this complex condition, so please remove it.

_Please_ also generate diffs with -p, it is very tedious to see patch hunks
without a function name context.

You didn't explain why you needed the renaming changes as I don't
remember needing it when using the code from loop distribution.

> I also noticed one not-resolved issue with outer-loop peeling - we don't
> consider remainder for possible vectorization of inner-loop as we can see
> on the following example:
>
>   for (i = 0; i < n; i++) {
>     diff = 0;
>     for (j = 0; j < M; j++) {
>       diff += in[j+i]*coeff[j];
>     }
>     out[i] = diff;
>   }
>
> Is it worth to fix it?

You mean vectorizing the inner loop in the niter peel epilogue loop
of the outer loop?  Possibly yes.

Thanks,
Richard.

> 2015-06-16  Yuri Rumyantsev  <ysrumyan@gmail.com>
>
> * tree-vect-loop-manip.c (rename_variables_in_bb): Add argument
> to allow renaming of PHI arguments on edges incoming from outer
> loop header, add corresponding check before start PHI iterator.
> (slpeel_tree_duplicate_loop_to_edge_cfg): Introduce new bool
> variable DUPLICATE_OUTER_LOOP and set it to true for outer loops
> with true force_vectorize.  Set-up dominator for outer loop too.
> Pass DUPLICATE_OUTER_LOOP as argument to rename_variables_in_bb.
> (slpeel_can_duplicate_loop_p): Allow duplicate of outer loop if it
> was marked with force_vectorize and has restricted cfg.
> * tree-vect-data-refs.c (vector_alignment_reachable_p): Alignment can
> not be reachable for outer loops.
> (vect_enhance_data_refs_alignment): Add test on true value of
> do_peeling.
>
> gcc/testsuite/ChangeLog:
> * gcc.dg/vect/vect-outer-simd-2.c: New test.
>
> 2015-06-09 16:26 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>> On Mon, Jun 8, 2015 at 12:27 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>> Hi All,
>>>
>>> Here is a simple fix which allows duplication of outer loops to
>>> perform peeling for number of iterations if outer loop is marked with
>>> pragma omp simd.
>>>
>>> Bootstrap and regression testing did not show any new failures.
>>> Is it OK for trunk?
>>
>> Hmm, I don't remember needing to adjust rename_variables_in_bb
>> when teaching loop distibution to call slpeel_tree_duplicate_to_edge_cfg
>> on non-innermost loops...  (I just copied, I never called
>> slpeel_can_duplicate_loop_p though).
>>
>> So - you should just remove the loop->inner condition from
>> slpeel_can_duplicate_loop_p as it is used by non-vectorizer
>> code as well (yeah, I never merged the nested loop support
>> for loop distribution...).
>>
>> Index: tree-vect-loop.c
>> ===================================================================
>> --- tree-vect-loop.c    (revision 224100)
>> +++ tree-vect-loop.c    (working copy)
>> @@ -1879,6 +1879,10 @@
>>        return false;
>>      }
>>
>> +  /* Peeling for alignment is not supported for outer-loop vectorization.  */
>> +  if (LOOP_VINFO_LOOP (loop_vinfo)->inner)
>> +    LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) = 0;
>>
>> I think you can't simply do this - if vect_enhance_data_refs_alignment
>> decided to peel for alignment then it has adjusted the DRs alignment
>> info already.  So instead of the above simply disallow peeling for
>> alignment in vect_enhance_data_refs_alignment?  Thus add
>> || ->inner to
>>
>>   /* Check if we can possibly peel the loop.  */
>>   if (!vect_can_advance_ivs_p (loop_vinfo)
>>       || !slpeel_can_duplicate_loop_p (loop, single_exit (loop)))
>>     do_peeling = false;
>>
>> ?
>>
>> I also can't see why the improvement has to be gated on force_vect,
>> it surely looks profitable to enable more outer loop vectorization in
>> general, no?
>>
>> How do the cost model calculations end up with peeling the outer loop
>> for niter?
>>
>> On targets which don't support unaligned accesses we're left with
>> versioning for alignment.  Isn't peeling for alignment better there?
>> Thus only disallow peeling for alignment if there is no unhandled
>> alignment?
>>
>> Thanks,
>> Richard.
>>
>>> ChangeLog:
>>>
>>> 2015-06-08  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>
>>> * tree-vect-loop-manip.c (rename_variables_in_bb): Add argument
>>> to allow renaming of PHI arguments on edges incoming from outer
>>> loop header, add corresponding check before start PHI iterator.
>>> (slpeel_tree_duplicate_loop_to_edge_cfg): Introduce new bool
>>> variable DUPLICATE_OUTER_LOOP and set it to true for outer loops
>>> with true force_vectorize.  Set-up dominator for outer loop too.
>>> Pass DUPLICATE_OUTER_LOOP as argument to rename_variables_in_bb.
>>> (slpeel_can_duplicate_loop_p): Allow duplicate of outer loop if it
>>> was marked with force_vectorize and has restricted cfg.
>>> * tre-vect-loop.c (vect_analyze_loop_2): Prohibit alignment peeling
>>> for outer loops.
>>>
>>> gcc/testsuite/ChangeLog:
>>> * gcc.dg/vect/vect-outer-simd-2.c: New test.



More information about the Gcc-patches mailing list