This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][vect] Disable vectorization of epilogues for loops with SIMDUID set
- From: "Andre Vieira (lists)" <andre dot simoesdiasvieira at arm dot com>
- To: Jakub Jelinek <jakub at redhat dot com>, Richard Biener <rguenther at suse dot de>
- Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>, richard dot sandiford at arm dot com, tobias at codesourcery dot com
- Date: Thu, 7 Nov 2019 14:26:29 +0000
- Subject: Re: [PATCH][vect] Disable vectorization of epilogues for loops with SIMDUID set
- References: <8be5d0a6-e14b-42af-6b46-738e5b760212@arm.com> <alpine.LSU.2.20.1908261410570.32458@zhemvz.fhfr.qr> <1f9dca66-d8ff-9b80-75b4-98df06e73b96@arm.com> <20191031165841.GL4650@tucnak> <63635e04-3faa-9b89-ca42-0a3b190f4106@arm.com> <nycvar.YFH.7.76.1911050807310.5566@zhemvz.fhfr.qr> <20191105071632.GM4650@tucnak>
Hi,
I rebased the patch on top of Richard Sandiford's patches. With his fixes
I can now allow vectorization of epilogues once we match simdlen.
This will, however, not turn on epilogue vectorization in cases where we
specify a desired simdlen that is never matched. That would require
more work: before simdlen is matched we would need to analyze each
vector_size after creating a "first_loop_vinfo" twice, once as an
epilogue (in case we never match simdlen) and once as a main
loop (in case simdlen would match its VF). Maybe there is a different
way of doing it, but I don't see it right now.
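To make the resulting condition easier to follow, here is a standalone sketch of the decision as described above. It is not the GCC code: the struct, its field names and the function name are made up for illustration, and each field only mirrors one term of the vect_epilogues expression in the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state that vect_analyze_loop consults.
   None of these names exist in GCC.  */
struct epilogue_inputs
{
  unsigned unmatched_simdlen; /* Requested simdlen still waiting for a
                                 matching VF; 0 if none was requested or
                                 it has already been matched.  */
  bool has_inner_loop;        /* Loop is not the innermost loop.  */
  bool epilogues_param;       /* PARAM_VECT_EPILOGUES_NOMASK is enabled.  */
  bool peeling_for_niter;     /* Main loop leaves scalar iterations.  */
  bool has_simduid;           /* SIMDUID is set on the loop.  */
  unsigned n_epilogues;       /* Epilogue loops already recorded.  */
};

/* Return true if, under the rules described in this mail, an epilogue
   would be considered for vectorization.  */
static bool
may_vectorize_epilogue (const struct epilogue_inputs *in)
{
  return (in->unmatched_simdlen == 0
          && !in->has_inner_loop
          && in->epilogues_param
          && in->peeling_for_niter
          && !in->has_simduid
          /* For now only one epilogue loop is allowed.  */
          && in->n_epilogues == 0);
}
```

With this sketch, a loop whose requested simdlen is never matched keeps unmatched_simdlen non-zero, so the function returns false for every candidate, which is the limitation described above.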
Bootstrapped and regression tested (also ran the libgomp tests) on x86_64
and aarch64. libgomp currently has 5 failures on aarch64, all of them
OpenACC tests. The first one I looked at is due to a reduction seemingly
performing too many iterations when defining '$acc parallel
vector_length(vl)'. I am looking into it.
Is this OK for trunk?
Cheers,
Andre
gcc/ChangeLog:
2019-11-07 Andre Vieira <andre.simoesdiasvieira@arm.com>
* tree-vect-loop.c (vect_analyze_loop): Disable epilogue
vectorization for loops with SIMDUID set. Enable epilogue
vectorization for loops with SIMDLEN set after finding a main
loop with a VF that matches it.
On 05/11/2019 07:16, Jakub Jelinek wrote:
On Tue, Nov 05, 2019 at 08:07:53AM +0100, Richard Biener wrote:
I was using loop->simdlen to detect whether it was a SIMD loop and I don't
believe that was correct, as can be witnessed by the mass failures in libgomp.
My apologies for not running this, didn't think of it!
I found that these were failing because we do not handle vectorization of
epilogues correctly when SIMDUID is set. For now Jakub and I agreed to disable
epilogue vectorization for loops where SIMDUID is set until we have fixed
this. See further comments inline.
I bootstrapped it on aarch64 and x86_64, ran libgomp on both.
This OK for trunk?
OK. Can you remove the simdlen == 0 check as a followup?
Yeah, exactly, I wanted to ask what the point of the simdlen == 0 check is.
All a non-zero simdlen says is a user assertion that certain inter-loop
dependencies don't exist.
Jakub
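For readers less familiar with the clause under discussion, the following is a small illustrative loop of the kind that carries a simdlen. The function name and the choice of simdlen(8) are assumptions made for the example and are not taken from the testsuite. The simd directive tells the compiler that iterations may be executed concurrently, and simdlen(8) states the preferred number of iterations per vector step. Whether a VF of 8 is actually matched depends on the target.

```c
#include <stddef.h>

/* Hypothetical example: dst[i] += k * src[i] for i in [0, n).
   Build with -fopenmp-simd (or -fopenmp) for the pragma to take effect;
   without it the pragma is ignored and the loop is still correct.  */
void
scale_add (float *restrict dst, const float *restrict src, float k, size_t n)
{
#pragma omp simd simdlen(8)
  for (size_t i = 0; i < n; i++)
    dst[i] += k * src[i];
}
```

If the main loop were vectorized with a VF of 8 and n were 10, two iterations would be left over. Those leftover iterations are what epilogue vectorization targets.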
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index dfa087ebb2cf01a5d21da0a921f8b6fc3d691ce9..22550ca2d6c56cce201ea422bfae5472a0d85f3a 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2455,11 +2455,15 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared)
           delete loop_vinfo;
           /* Only vectorize epilogues if PARAM_VECT_EPILOGUES_NOMASK is
-             enabled, this is not a simd loop and it is the innermost loop. */
-          vect_epilogues = (!loop->simdlen
+             enabled, SIMDUID is not set, it is the innermost loop and we have
+             either already found the loop's SIMDLEN or there was no SIMDLEN to
+             begin with.
+             TODO: Enable epilogue vectorization for loops with SIMDUID set. */
+          vect_epilogues = (!simdlen
                             && loop->inner == NULL
                             && PARAM_VALUE (PARAM_VECT_EPILOGUES_NOMASK)
                             && LOOP_VINFO_PEELING_FOR_NITER (first_loop_vinfo)
+                            && !loop->simduid
                             /* For now only allow one epilogue loop. */
                             && first_loop_vinfo->epilogue_vinfos.is_empty ());