This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Merge epilog loop & loop version due to alias/alignment in vectorization?
- From: Bingfeng Mei <bmei at broadcom dot com>
- To: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Tue, 4 Feb 2014 16:27:56 +0000
Hi,
One of the biggest issues we have with GCC vectorization is bloated code size.
For example, the vectorized version of the simple code below is 2.5 times the
size of the non-vectorized one. One reason is that GCC often creates one loop
copy because of aliasing/alignment (loop versioning) and one epilog loop
because of the loop iteration count constraint.
void foo (int *a, int *b, int N)
{
  int i;
  for (i = 0; i < N; i++)
    {
      a[i] = b[i];
    }
}
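To make the duplication concrete, here is a rough hand-written C sketch of the shape the vectorized function takes. It is not GCC output: the names foo_versioned and ranges_disjoint are hypothetical, a vectorization factor of 4 is assumed, and memcpy of 4 ints stands in for one vector load/store. The real runtime checks GCC emits may differ.

```c
#include <stdint.h>
#include <string.h>

enum { VF = 4 }; /* assumed vectorization factor: 4 ints per vector */

/* Hypothetical helper: nonzero if a[0..n) and b[0..n) do not overlap.  */
static int ranges_disjoint (const int *a, const int *b, int n)
{
  uintptr_t pa = (uintptr_t) a, pb = (uintptr_t) b;
  uintptr_t bytes = (uintptr_t) n * sizeof (int);
  return pa + bytes <= pb || pb + bytes <= pa;
}

/* Hand-written stand-in for the shape of the vectorized foo.  */
void foo_versioned (int *a, int *b, int N)
{
  int i = 0;
  if (N <= 0)
    return;
  if (ranges_disjoint (a, b, N))
    {
      int main_end = N - N % VF;
      /* "Vector" loop: one memcpy of VF ints models one vector copy.  */
      for (; i < main_end; i += VF)
        memcpy (&a[i], &b[i], VF * sizeof (int));
      /* Epilog loop (scalar copy #1): the N % VF leftover iterations.  */
      for (; i < N; i++)
        a[i] = b[i];
    }
  else
    {
      /* Versioned loop (scalar copy #2): same body, starts at 0.  */
      for (i = 0; i < N; i++)
        a[i] = b[i];
    }
}
```

The two scalar loops have the same body; only the starting value of i differs.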
Looking closely, the epilog loop and the alignment/aliasing loop are almost
identical; they differ only in the initial values of some variables entering
the loop (compare <bb 7> and <bb 13> in the dump excerpt below). Can they be
merged into one in such situations? If so, any suggestions on how to
implement it?
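At the source level, the merge I have in mind would look roughly like the sketch below. This is only an illustration under assumptions, not a proposed implementation inside the vectorizer: foo_merged and ranges_disjoint are hypothetical names, a vectorization factor of 4 is assumed, and memcpy of 4 ints stands in for one vector load/store.

```c
#include <stdint.h>
#include <string.h>

enum { VF = 4 }; /* assumed vectorization factor: 4 ints per vector */

/* Hypothetical helper: nonzero if a[0..n) and b[0..n) do not overlap.  */
static int ranges_disjoint (const int *a, const int *b, int n)
{
  uintptr_t pa = (uintptr_t) a, pb = (uintptr_t) b;
  uintptr_t bytes = (uintptr_t) n * sizeof (int);
  return pa + bytes <= pb || pb + bytes <= pa;
}

/* Sketch: a single scalar loop serves as both fallback and epilog.  */
void foo_merged (int *a, int *b, int N)
{
  int i = 0;
  if (N > 0 && ranges_disjoint (a, b, N))
    {
      int main_end = N - N % VF;
      /* "Vector" loop: one memcpy of VF ints models one vector copy.  */
      for (; i < main_end; i += VF)
        memcpy (&a[i], &b[i], VF * sizeof (int));
    }
  /* Shared scalar loop: entered with i == 0 when the runtime check
     failed, or with i == main_end after the vector loop.  */
  for (; i < N; i++)
    a[i] = b[i];
}
```

Here the only difference between the two uses of the scalar loop is the value of i on entry, which is the same situation as the PHI inputs in the dump.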
...
  <bb 7>:
  # i_39 = PHI <i_47(8), i_50(10)>
  _41 = (long unsigned int) i_39;
  _42 = _41 * 4;
  _43 = a_7(D) + _42;
  _44 = b_9(D) + _42;
  _45 = *_44;
  *_43 = _45;
  i_47 = i_39 + 1;
  if (N_4(D) > i_47)
    goto <bb 8>;
  else
    goto <bb 15>;

  <bb 8>:
  goto <bb 7>;

  <bb 9>:
  # i_51 = PHI <i_13(6)>
  tmp.6_56 = (int) ratio_mult_vf.5_38;
  if (niters.3_34 == ratio_mult_vf.5_38)
    goto <bb 16>;
  else
    goto <bb 10>;

  <bb 10>:
  # i_50 = PHI <tmp.6_56(9), 0(4)>
  goto <bb 7>;

  <bb 11>:
  goto <bb 6>;

  <bb 12>:

  <bb 13>:
  # i_24 = PHI <0(12), i_32(14)>
  _26 = (long unsigned int) i_24;
  _27 = _26 * 4;
  _28 = a_7(D) + _27;
  _29 = b_9(D) + _27;
  _30 = *_29;
  *_28 = _30;
  i_32 = i_24 + 1;
  if (N_4(D) > i_32)
    goto <bb 14>;
  else
    goto <bb 17>;
...
Thanks,
Bingfeng