This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Merge epilog loop & loop version due to alias/alignment in vectorization?


On Tue, Feb 4, 2014 at 5:27 PM, Bingfeng Mei <bmei@broadcom.com> wrote:
> Hi,
> One of the biggest issues we have with GCC vectorization is bloated code size.
> For example, the vectorized version is 2.5 times the size of the non-vectorized
> one for the following simple code. One reason is that GCC often creates one
> loop copy because of aliasing/alignment and one epilog loop because of the
> loop iteration constraint.

One thing to improve is to reduce the cases where we apply peeling
for alignment, for example by modelling its cost effect more accurately
(also by considering that when you align 'a' you might spuriously
misalign 'b').
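
To make the 'a' versus 'b' point concrete, here is a minimal sketch of the arithmetic involved. It is not GCC code; the helper names peel_count and aligned_after_peel are made up for illustration, and it assumes a power-of-two vector size and addresses that are already aligned to the element size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of scalar iterations to peel so that an
   access starting at ADDR becomes aligned to VEC_BYTES.
   Assumes VEC_BYTES is a power of two and ADDR is a multiple of ELEM_SIZE. */
static size_t peel_count(uintptr_t addr, size_t elem_size, size_t vec_bytes)
{
    size_t misalign = (size_t)(addr & (vec_bytes - 1));
    return misalign ? (vec_bytes - misalign) / elem_size : 0;
}

/* Hypothetical helper: nonzero if the access starting at ADDR is
   VEC_BYTES-aligned after PEEL scalar iterations have been peeled. */
static int aligned_after_peel(uintptr_t addr, size_t peel,
                              size_t elem_size, size_t vec_bytes)
{
    return ((addr + peel * elem_size) & (vec_bytes - 1)) == 0;
}
```

With 4-byte ints and 16-byte vectors, an 'a' at address 0x1004 needs 3 peeled iterations. If 'b' started out aligned at 0x2000, those same 3 iterations leave it at 0x200c, which is misaligned, so the peeling only moved the problem from the store to the load.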

Another idea is (if the target supports misaligned accesses) to
do both prologue and epilogue in vector code by doing redundant
work (overlap with the first / last vector iterations) and thus avoid
creating a loop for the prologue / epilogue.  Of course that has
constraints on the kind of operations that are supported (likely
more difficult if reductions / inductions are involved or if there
are dependences to be honored).
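
As a rough source-level sketch of the overlap idea for the epilogue (not what the vectorizer emits), consider the copy loop from the original mail. The names VF, copy_block and copy_overlap_tail are made up for illustration; the fixed-size memcpy stands in for a misaligned vector load and store, and the sketch assumes 'a' and 'b' do not alias.

```c
#include <string.h>

enum { VF = 4 };  /* assumed vectorization factor: 4 ints per vector */

/* Stand-in for one misaligned vector load followed by one vector store. */
static void copy_block(int *dst, const int *src)
{
    int tmp[VF];
    memcpy(tmp, src, sizeof tmp);
    memcpy(dst, tmp, sizeof tmp);
}

/* Copy n ints from b to a. Assumes a and b do not overlap. */
void copy_overlap_tail(int *a, const int *b, int n)
{
    int i;

    if (n < VF) {
        /* Too short for even one vector: a scalar path is still needed. */
        for (i = 0; i < n; i++)
            a[i] = b[i];
        return;
    }

    /* Main vector loop over all full vectors. */
    for (i = 0; i + VF <= n; i += VF)
        copy_block(a + i, b + i);

    /* Instead of a scalar epilogue loop, redo the last VF elements with
       one more vector operation that overlaps the previous iteration. */
    if (i < n)
        copy_block(a + n - VF, b + n - VF);
}
```

The redundant work is harmless here because storing the same values twice is idempotent. For a reduction such as a sum, the overlapped elements would be counted twice, which is one of the constraints mentioned above. Note also that the short-trip-count case still needs some fallback, so this removes the epilogue loop only when at least one full vector is known to execute.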

Richard.

> void foo (int *a, int *b, int N)
> {
>   int i;
>   for (i = 0; i < N; i++)
>   {
>     a[i] = b[i];
>   }
> }
>
> Looking closely, the epilog loop and the alignment/aliasing loop are almost
> identical, differing only in the initial values of some variables entering
> the loop. Can they be merged into one in such situations? If yes, any
> suggestion on how to implement it?
>
> ...
>   <bb 7>:
>   # i_39 = PHI <i_47(8), i_50(10)>
>   _41 = (long unsigned int) i_39;
>   _42 = _41 * 4;
>   _43 = a_7(D) + _42;
>   _44 = b_9(D) + _42;
>   _45 = *_44;
>   *_43 = _45;
>   i_47 = i_39 + 1;
>   if (N_4(D) > i_47)
>     goto <bb 8>;
>   else
>     goto <bb 15>;
>
>   <bb 8>:
>   goto <bb 7>;
>
>   <bb 9>:
>   # i_51 = PHI <i_13(6)>
>   tmp.6_56 = (int) ratio_mult_vf.5_38;
>   if (niters.3_34 == ratio_mult_vf.5_38)
>     goto <bb 16>;
>   else
>     goto <bb 10>;
>
>   <bb 10>:
>   # i_50 = PHI <tmp.6_56(9), 0(4)>
>   goto <bb 7>;
>
>   <bb 11>:
>   goto <bb 6>;
>
>   <bb 12>:
>
>   <bb 13>:
>   # i_24 = PHI <0(12), i_32(14)>
>   _26 = (long unsigned int) i_24;
>   _27 = _26 * 4;
>   _28 = a_7(D) + _27;
>   _29 = b_9(D) + _27;
>   _30 = *_29;
>   *_28 = _30;
>   i_32 = i_24 + 1;
>   if (N_4(D) > i_32)
>     goto <bb 14>;
>   else
>     goto <bb 17>;
> ...
>
> Thanks,
> Bingfeng
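
For reference, the control flow being asked about can be sketched at the source level as follows. This only illustrates the shape of a merged loop; it is not how GCC implements or would implement it. The names VF, ranges_overlap and foo_merged are made up, and the memcpy pair stands in for a vector load and store.

```c
#include <stdint.h>
#include <string.h>

enum { VF = 4 };  /* assumed vectorization factor: 4 ints per vector */

/* Hypothetical runtime alias check: nonzero if the n-int ranges
   starting at a and b share any bytes. */
static int ranges_overlap(const int *a, const int *b, int n)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)n * sizeof(int);
    return pa < pb + bytes && pb < pa + bytes;
}

void foo_merged(int *a, int *b, int N)
{
    int i = 0;

    /* Vector loop, entered only when the runtime checks pass. */
    if (N >= VF && !ranges_overlap(a, b, N)) {
        for (; i + VF <= N; i += VF) {
            int tmp[VF];
            memcpy(tmp, b + i, sizeof tmp);
            memcpy(a + i, tmp, sizeof tmp);
        }
    }

    /* Single scalar loop. Entered with i == 0 it is the versioned
       fallback; entered with i == the number of elements already
       vectorized it is the epilogue. */
    for (; i < N; i++)
        a[i] = b[i];
}
```

In this shape the only difference between the two uses of the scalar loop is the initial value of i, which matches the observation about the PHI nodes in the dump above.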

