[PATCH][vect]Account for epilogue's peeling for gaps when checking if we have enough niters for epilogue
Andre Vieira (lists)
andre.simoesdiasvieira@arm.com
Fri Nov 8 14:48:00 GMT 2019
Hi,
As I mentioned in the patch to disable epilogue vectorization for loops
with SIMDUID set, there were still some aarch64 libgomp failures. This
patch fixes those.
The problem was that we were vectorizing a reduction that was only using
one of the parts from a complex number, creating data accesses with
gaps. For this we set PEELING_FOR_GAPS which forces us to peel an extra
scalar iteration.
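
For illustration only, a loop of the following shape produces this kind of access pattern. This is a minimal sketch, not the test added by the patch; the names `struct cplx`, `data`, `N` and `sum_real_parts` are made up here. Only the `re` field is read, so the `im` fields form gaps in the accessed memory. The vectorizer may load whole re/im pairs and discard the unused lanes, and the last such load could reach past the final element it actually needs, which is presumably why a scalar iteration is peeled.

```c
#define N 10

/* Hypothetical stand-in for a complex number.  */
struct cplx
{
  float re;
  float im;
};

struct cplx data[N];

/* Reduction over one part of each element only.  The unused 'im'
   fields are the gaps between consecutive accesses.  */
float
sum_real_parts (void)
{
  float sum = 0.0f;
  for (int i = 0; i < N; ++i)
    sum += data[i].re;
  return sum;
}
```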
In the testcase I looked at, niters was known to be 10. The first VF
was 4, leaving 10 % 4 = 2 scalar iterations. The epilogue had VF 2, so
the current code thought we could vectorize it. However, because of
PEELING_FOR_GAPS the vectorized epilogue gets a scalar epilogue of its
own, and we ended up doing too many iterations: surprisingly 12, I think
because the code assumed we hadn't created said epilogue.
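
The arithmetic above can be sketched as a small check. This is a simplified model under stated assumptions, not the code in vect_do_peeling: the helper names `remaining_iters` and `enough_iters_for_epilogue` are hypothetical, and the model ignores prologue peeling and any gap peeling in the main loop. It only captures the idea that an epilogue needing peeling for gaps must keep one iteration back for its own scalar epilogue.

```c
#include <stdbool.h>

/* Iterations left once the main vector loop (factor main_vf) has
   consumed as many full vectors as it can.  */
unsigned
remaining_iters (unsigned niters, unsigned main_vf)
{
  return niters % main_vf;
}

/* Hypothetical helper: true if a vector epilogue with factor
   epilogue_vf can execute at least once.  With peeling for gaps, one
   extra iteration is reserved for the epilogue's scalar epilogue.  */
bool
enough_iters_for_epilogue (unsigned niters, unsigned main_vf,
                           unsigned epilogue_vf, bool peeling_for_gaps)
{
  unsigned left = remaining_iters (niters, main_vf);
  unsigned needed = epilogue_vf + (peeling_for_gaps ? 1 : 0);
  return left >= needed;
}
```

With niters = 10, main VF 4 and epilogue VF 2, the check passes when gaps are ignored (2 >= 2) and fails once the extra scalar iteration is counted (2 < 3). With niters = 11 it passes (3 >= 3).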
I ran a local check where I upped the iterations of the Fortran test to
11, and I see GCC vectorizing the epilogue with VF = 2 plus a scalar
epilogue for one iteration, so that looks good too. I have transformed
it into a C test that reproduces the issue without OpenACC, so I can run
it in GCC's normal testsuite more easily.
Bootstrapped on aarch64 and x86_64.
Is this OK for trunk?
Cheers,
Andre
gcc/ChangeLog:
2019-11-08 Andre Vieira <andre.simoesdiasvieira@arm.com>
* tree-vect-loop-manip.c (vect_do_peeling): Take epilogue gaps
into account when checking if there are enough iterations to
vectorize epilogue.
gcc/testsuite/ChangeLog:
2019-11-08 Andre Vieira <andre.simoesdiasvieira@arm.com>
* gcc.dg/vect/vect-reduc-epilogue-gaps.c: New test.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gaps.patch
Type: text/x-patch
Size: 2230 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20191108/0667edc5/attachment.bin>