This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [autovect][patch]Loop Versioning for Vectorization
- From: Dorit Naishlos <DORIT at il dot ibm dot com>
- To: Keith Besaw <kbesaw at us dot ibm dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Wed, 26 Jan 2005 16:31:07 +0200
- Subject: Re: [autovect][patch]Loop Versioning for Vectorization
Thanks, Keith.
I think we should introduce a flag that allows controlling whether
versioning takes place or not. We can enable it by default when
vectorization is on, but we want to have a way to disable it, since we
currently apply versioning pretty much blindly (we haven't incorporated a
cost model yet), so we'll be touching a lot more loops and increasing code
size along the way (among other penalties). Maybe -ftree-vect-loop-version?
One small issue is that the testcases need to be updated, as more loops get
vectorized now. I get the following errors with the patch on
i686-pc-linux-gnu:
XPASS: gcc.dg/vect/vect-29.c scan-tree-dump-times vectorized 1 loops 1
XPASS: gcc.dg/vect/vect-44.c scan-tree-dump-times vectorized 1 loops 1
XPASS: gcc.dg/vect/vect-48.c scan-tree-dump-times vectorized 1 loops 1
XPASS: gcc.dg/vect/vect-50.c scan-tree-dump-times vectorized 1 loops 1
XPASS: gcc.dg/vect/vect-52.c scan-tree-dump-times vectorized 1 loops 1
XPASS: gcc.dg/vect/vect-77.c scan-tree-dump-times vectorized 1 loops 1
XPASS: gcc.dg/vect/vect-78.c scan-tree-dump-times vectorized 1 loops 1
XPASS: gcc.dg/vect/vect-80.c scan-tree-dump-times vectorized 1 loops 1
XPASS: gcc.dg/vect/vect-ifcvt-1.c scan-tree-dump-times vectorized 2 loops 1
FAIL: gcc.dg/vect/vect-none.c scan-tree-dump-times vectorized 2 loops 1
Targets that don't support misalignment will have more of those XPASSes.
In mainline there are target keywords that make it easy to set dg-final
for sets of targets (thanks to Janis), so rather than manually going over
all the tests in autovect and making the required modifications, I suggest
doing one (or preferably both) of the following:
1) wait for next merge from mainline, at which point it will be much easier
to do this (I'll try to do the merge this week).
2) use the new flag I mentioned above: pass -fno-tree-vect-loop-version in
vect.exp, so that the current testcases maintain their current behavior,
and add new tests for which we'll enable versioning.
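To make option 1 concrete, here is a rough sketch of how a testcase can attach a target selector to its dg-final line. The dg directives are ordinary C comments, so the file also compiles as plain C. The function name copy_shifted and the array sizes are made up for illustration, and the choice of the vect_no_align keyword for this particular loop is an assumption, not something taken from an existing test:

```c
/* { dg-do compile } */
/* { dg-require-effective-target vect_int } */

#define N 16

int a[N + 1];
int b[N];

/* Hypothetical test loop: the store to a[i + 1] is offset by one element,
   so it is misaligned whenever a[0] is vector aligned.  */
void
copy_shifted (void)
{
  int i;

  for (i = 0; i < N; ++i)
    a[i + 1] = b[i];
}

/* Assumed expectation: one loop vectorized, marked xfail on targets that
   cannot handle misaligned accesses.  With versioning enabled such an
   xfail can turn into an XPASS, which is what the list above shows.  */
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
```

Once versioning is on by default, the selector on lines like this one is the piece that would need revisiting per target.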
> (vectorize_loops): Move call to rewrite_into_loop_closed_ssa so it's done
> each time it's needed rather than once at the end.
I hope to have a patch to preserve loop-closed-form during our loop-peeling
within the vectorizer, because we probably don't want to call
rewrite_into_loop_closed_ssa each time we vectorize a loop (using peeling).
thanks,
dorit
gcc-patches-owner@gcc.gnu.org wrote on 22/01/2005 09:21:35:
> Add loop versioning for vectorization. For example:
>
> for(i=0; i<N; ++i)
>   {
>     pa[i] = pb[i] + pc[i];
>   }
>
> If the alignment of the pointers pa, pb, and pc cannot be determined at
> compile time then a runtime test can be inserted to check alignment and
> direct control to either a vector version or a scalar version of the loop.
> At a high level it might look something like:
>
> if ((&pa[0] % AlignConst == 0)
>     && (&pb[0] % AlignConst == 0)
>     && (&pc[0] % AlignConst == 0))
>   {
>     for(i=0; i<N/VectorizationFactor; ++i)
>       {
>         vpa[i] = vpb[i] + vpc[i];
>       }
>   }
> else
>   {
>     for(i=0; i<N; ++i)
>       {
>         pa[i] = pb[i] + pc[i];
>       }
>   }
>
> The constants AlignConst and VectorizationFactor are machine dependent.
> For example, if a vector is required to be aligned on a 16 byte boundary
> then AlignConst would be 16. If a vector vpa[i] contains 4 elements of
> pa[i] then VectorizationFactor would be 4.
>
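For reference, the versioning scheme quoted above can be written out as compilable C. This is only a hand-written model of what the transformation produces, not the code the patch generates. The names add_versioned and is_aligned are made up, the values 16 and 4 are the example values from the message, and the inner lane loop merely stands in for a single vector add. The scalar epilogue for element counts that are not a multiple of the vectorization factor is an addition that the high-level pseudocode leaves out:

```c
#include <stddef.h>
#include <stdint.h>

/* Example values from the message: 16-byte vectors of four 32-bit ints.  */
#define ALIGN_CONST 16
#define VECTORIZATION_FACTOR 4

/* Nonzero when p lies on an ALIGN_CONST-byte boundary.  */
static int
is_aligned (const void *p)
{
  return ((uintptr_t) p % ALIGN_CONST) == 0;
}

/* Model of the versioned loop.  Returns 1 if the "vector" version ran
   and 0 if the scalar fallback ran.  */
int
add_versioned (int32_t *pa, const int32_t *pb, const int32_t *pc, size_t n)
{
  size_t i, lane;

  if (is_aligned (pa) && is_aligned (pb) && is_aligned (pc))
    {
      size_t main_end = n - n % VECTORIZATION_FACTOR;

      /* Each outer iteration models one vector operation.  */
      for (i = 0; i < main_end; i += VECTORIZATION_FACTOR)
        for (lane = 0; lane < VECTORIZATION_FACTOR; ++lane)
          pa[i + lane] = pb[i + lane] + pc[i + lane];

      /* Scalar epilogue for the leftover elements (assumed here).  */
      for (; i < n; ++i)
        pa[i] = pb[i] + pc[i];
      return 1;
    }

  /* Scalar version: taken when any pointer is not suitably aligned.  */
  for (i = 0; i < n; ++i)
    pa[i] = pb[i] + pc[i];
  return 0;
}
```

Both versions compute the same result; only the path taken depends on the runtime alignment of the three pointers. Keeping both copies of the loop is where the code-size cost of versioning comes from.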
> Tested on ppc
> No differences in make check or in SPEC.
>
> OK for autovect branch?
>
> Keith.
>
>
> Changelog:
> * tree-vectorizer.c (new_loop_vec_info): Initialize
> LOOP_VINFO_MAY_MISALIGN_STMTS.
> (destroy_loop_vec_info): varray_clear of
> LOOP_VINFO_MAY_MISALIGN_STMTS.
> (vect_create_cond_for_align_checks): New.
> (vect_transform_loop): Add calls to tree_ssa_loop_version and
> vect_create_cond_for_align_checks.
> (vect_build_dist_vector): Change loop_info to loop_vinfo as it is
> called elsewhere.
> (vect_update_misalignment_for_peel): New. Holds multiple use
> code.
> (vect_enhance_data_refs_alignment): Decide when to do loop
> versioning and update data structures.
> (vect_analyze_data_refs_alignment): Fix comment.
> (vect_pattern_recog): Fix comment.
> (vectorize_loops): Move call to rewrite_into_loop_closed_ssa so
> it's done each time it's needed rather than once at the end.
> * tree-vectorizer.h (struct _loop_vec_info): Add fields ptr_mask and
> may_misalign_stmts.
> (MAX_RUNTIME_ALIGNMENT_CHECKS): New named constant.
> (MAX_RUNTIME_ALIGNMENT_CHECKS): New named constant.
>
> [attachment "vect.1-21.diff" deleted by Dorit Naishlos/Haifa/IBM]
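The ChangeLog mentions a ptr_mask field and a limit on the number of runtime alignment checks, but does not say how the combined test is formed. One common way to fold several alignment checks into a single comparison, shown here purely as an assumption about the general technique, is to OR the addresses together and test the low bits once against a mask of alignment minus one. The function name all_aligned is hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when every pointer in ptrs has all of
   the bits in mask clear.  For a power-of-two alignment A the mask is
   A - 1, e.g. 15 for 16-byte alignment.  */
int
all_aligned (const void *const *ptrs, size_t count, uintptr_t mask)
{
  uintptr_t bits = 0;
  size_t i;

  /* A low bit set in any address survives the OR, so a single
     mask-and-compare covers all pointers at once.  */
  for (i = 0; i < count; ++i)
    bits |= (uintptr_t) ptrs[i];

  return (bits & mask) == 0;
}
```

With this shape, the guard costs one OR per extra pointer plus a single AND and compare, which suggests why a cap on the number of checked pointers could be useful.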