[RFC] S/390: Alignment peeling prolog generation
Robin Dapp
rdapp@linux.vnet.ibm.com
Tue Apr 11 14:38:00 GMT 2017
Hi,
when looking at various vectorization examples on s390x I noticed that
we still peel vf/2 iterations for alignment even though vectorization
costs of unaligned loads and stores are the same as normal loads/stores.
A simple example is
  void foo(int *restrict a, int *restrict b, unsigned int n)
  {
    for (unsigned int i = 0; i < n; i++)
      {
        b[i] = a[i] * 2 + 1;
      }
  }
which gets peeled unless __builtin_assume_aligned (a, 8) is used.
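For reference, a minimal sketch of how the builtin mentioned above might be applied to the example. The name foo_aligned is made up for illustration, and the caller is assumed to really pass an 8-byte aligned `a`, otherwise the behavior is undefined:

```c
/* Sketch only: variant of foo in which `a` is asserted to be 8-byte
   aligned via GCC's __builtin_assume_aligned.  foo_aligned is a
   hypothetical name; the caller must guarantee the alignment.  */
void foo_aligned(int *restrict a, int *restrict b, unsigned int n)
{
  /* The builtin returns its first argument; the compiler may assume
     the returned pointer is aligned to at least 8 bytes.  */
  const int *pa = __builtin_assume_aligned(a, 8);

  for (unsigned int i = 0; i < n; i++)
    {
      b[i] = pa[i] * 2 + 1;
    }
}
```

Only `a` is annotated here, matching the call quoted above; whether `b` needs the same treatment depends on the target and on the costs discussed below.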
In tree-vect-data-refs.c there are several checks that involve costs in
the peeling decision, none of which seems to suffice in this case. For a
loop with only read DRs there is a check that has been triggering (i.e.
disabling peeling) since we implemented the vectorization costs.
Here, we have DR_MISALIGNMENT (dr) == -1 (i.e. unknown misalignment) for
all DRs, but the costs should still dictate that we never peel. I
attached a tentative patch for discussion which fixes the problem by
comparing the costs for npeel = 0 and npeel = vf/2 after ensuring we
support all misalignments. Is there a better way and place to do it? Are
we missing something somewhere else that would preclude the peeling from
happening?
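To make the intended comparison concrete, here is a toy model of it. This is not the code from the patch and not GCC's cost machinery; all names (toy_costs, toy_total_cost, toy_should_peel) are made up, and it optimistically assumes that peeling makes every access aligned:

```c
/* Toy cost model, NOT GCC's implementation.  All names are
   hypothetical and only illustrate the npeel = 0 vs. npeel = vf/2
   comparison.  */
struct toy_costs
{
  unsigned aligned_cost;     /* one aligned vector load/store */
  unsigned unaligned_cost;   /* one unaligned vector load/store */
  unsigned scalar_iter_cost; /* one peeled scalar iteration */
};

/* Cost of the vector loop body plus the prolog for a given npeel.
   Assumption: with npeel == 0 all accesses stay unaligned, with
   npeel > 0 all accesses become aligned.  */
static unsigned
toy_total_cost(const struct toy_costs *c, unsigned n_refs,
               unsigned vec_iters, unsigned npeel)
{
  unsigned per_access = (npeel == 0) ? c->unaligned_cost : c->aligned_cost;
  return n_refs * vec_iters * per_access + npeel * c->scalar_iter_cost;
}

/* Return nonzero only if peeling vf/2 iterations is strictly cheaper
   than not peeling at all.  */
static int
toy_should_peel(const struct toy_costs *c, unsigned n_refs,
                unsigned vec_iters, unsigned vf)
{
  unsigned no_peel = toy_total_cost(c, n_refs, vec_iters, 0);
  unsigned peel = toy_total_cost(c, n_refs, vec_iters, vf / 2);
  return peel < no_peel;
}
```

With equal aligned and unaligned costs, as on s390x, the prolog is pure overhead in this model, so toy_should_peel returns 0 for any vf >= 2.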
This is not intended for stage 4, obviously :)
Regards
Robin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gcc-omit-peeling.diff
Type: text/x-patch
Size: 2442 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20170411/db143875/attachment.bin>