[RFC] S/390: Alignment peeling prolog generation

Robin Dapp rdapp@linux.vnet.ibm.com
Tue Apr 11 14:38:00 GMT 2017


Hi,

when looking at various vectorization examples on s390x I noticed that
we still peel vf/2 iterations for alignment even though vectorization
costs of unaligned loads and stores are the same as normal loads/stores.

A simple example is

void foo(int *restrict a, int *restrict b, unsigned int n)
{
  for (unsigned int i = 0; i < n; i++)
    {
      b[i] = a[i] * 2 + 1;
    }
}

which gets peeled unless __builtin_assume_aligned (a, 8) is used.

In tree-vect-data-refs.c there are several checks that involve costs  in
the peeling decision none of which seems to suffice in this case. For a
loop with only read DRs there is a check that has been triggering (i.e.
disable peeling) since we implemented the vectorization costs.

Here, we have DR_MISALIGNMENT (dr) == -1 for all DRs but the costs
should still dictate to never peel. I attached a tentative patch for
discussion which fixes the problem by checking the costs for npeel = 0
and npeel = vf/2 after ensuring we support all misalignments. Is there a
better way and place to do it? Are we missing something somewhere else
that would preclude the peeling from happening?

This is not indended for stage 4 obviously :)

Regards
 Robin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gcc-omit-peeling.diff
Type: text/x-patch
Size: 2442 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20170411/db143875/attachment.bin>


More information about the Gcc-patches mailing list