This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH 1/2][vect]PR 88915: Vectorize epilogues when versioning loops
- From: Richard Biener <rguenther at suse dot de>
- To: "Andre Vieira (lists)" <andre dot simoesdiasvieira at arm dot com>
- Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 26 Aug 2019 14:41:05 +0200 (CEST)
- Subject: Re: [PATCH 1/2][vect]PR 88915: Vectorize epilogues when versioning loops
- References: <385547e6-abbd-3633-ad69-d4fb6e604c97@arm.com>
On Fri, 23 Aug 2019, Andre Vieira (lists) wrote:
> Hi,
>
> This patch is an improvement on my last RFC. As you pointed out, we can do
> the vectorization analysis of the epilogues before doing the transformation,
> using the same approach as used by openmp simd. I have not yet incorporated
> the cost tweaks for vectorizing the epilogue, I would like to do this in a
> subsequent patch, to make it easier to test the differences.
>
> I currently disable the vectorization of epilogues when versioning for
> iterations. This is simply because I do not completely understand how the
> assumptions are created and couldn't determine whether using skip_vectors with
> this would work. If you don't think it is a problem or have a testcase to
> show it work I would gladly look at it.
I don't think there's any problem. Basically the versioning condition
is if (can_we_compute_niter), most of the time it is an extra
condition from niter analysis under which niter is for example zero.
This should also be the same with all vector sizes.
- delete loop_vinfo;
+ {
+ /* Set versioning threshold of the original LOOP_VINFO
based
+ on the last vectorization of the epilog. */
+ LOOP_VINFO_VERSIONING_THRESHOLD (first_loop_vinfo)
+ = LOOP_VINFO_VERSIONING_THRESHOLD (loop_vinfo);
+ delete loop_vinfo;
+ }
I'm not sure this works reliably since the order we process vector
sizes is under target control and not necessarily decreasing. I think
you want to keep track of the minimum here? Preferably separately
I guess.
>From what I see vect_analyze_loop_2 doesn't need vect_epilogues_nomask
and thus it doesn't change throughout the iteration.
else
- delete loop_vinfo;
+ {
+ /* Disable epilog vectorization if we can't determine the
epilogs can
+ be vectorized. */
+ *vect_epilogues_nomask &= vectorized_loops > 1;
+ delete loop_vinfo;
+ }
and this is a bit premature and instead it should be done
just before returning success? Maybe also storing the
epilogue vector sizes we can handle in the loop_vinfo,
thereby representing !vect_epilogues_nomask if there are no
such sizes which also means that
@@ -1013,8 +1015,13 @@ try_vectorize_loop_1 (hash_table<simduid_to_vf>
*&simduid_to_vf_htab,
/* Epilogue of vectorized loop must be vectorized too. */
if (new_loop)
- ret |= try_vectorize_loop_1 (simduid_to_vf_htab,
num_vectorized_loops,
- new_loop, loop_vinfo, NULL, NULL);
+ {
+ /* Don't include vectorized epilogues in the "vectorized loops"
count.
+ */
+ unsigned dont_count = *num_vectorized_loops;
+ ret |= try_vectorize_loop_1 (simduid_to_vf_htab, &dont_count,
+ new_loop, loop_vinfo, NULL, NULL);
+ }
can be optimized to not re-check all smaller sizes (but even assert
re-analysis succeeds to the original result for the actual transform).
Otherwise this looks reasonable to me.
Thanks,
Richard.
>
> Bootstrapped this and the next patch on x86_64 and aarch64-unknown-linux-gnu,
> with no regressions (after test changes in next patch).
>
> gcc/ChangeLog:
> 2019-08-23 Andre Vieira <andre.simoesdiasvieira@arm.com>
>
> PR 88915
> * gentype.c (main): Add poly_uint64 type to generator.
> * tree-vect-loop.c (vect_analyze_loop_2): Make it determine
> whether we vectorize epilogue loops.
> (vect_analyze_loop): Idem.
> (vect_transform_loop): Pass decision to vectorize epilogues
> to vect_do_peeling.
> * tree-vect-loop-manip.c (vect_do_peeling): Enable skip-vectors
> when doing loop versioning if we decided to vectorize epilogues.
> (vect-loop_versioning): Moved decision to check_profitability
> based on cost model.
> * tree-vectorizer.h (vect_loop_versioning, vect_do_peeling,
> vect_analyze_loop, vect_transform_loop): Update declarations.
> * tree-vectorizer.c: Include params.h
> (try_vectorize_loop_1): Initialize vect_epilogues_nomask
> to PARAM_VECT_EPILOGUES_NOMASK and pass it to vect_analyze_loop
> and vect_transform_loop. Also make sure vectorizing epilogues
> does not count towards number of vectorized loops.
>
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 247165 (AG München)